tomo's blog

Importing and Exporting Drupal Taxonomy

Submitted by tomo on August 31, 2011 - 3:10am

There are multiple modules for importing and exporting Drupal taxonomies (Drupal switches between using the term "taxonomy" and "vocabulary" like a clinical schizophrenic). Some use CSV format (http://drupal.org/project/taxonomy_csv), others XML (http://drupal.org/project/taxonomy_xml), and still others use a PHP array (http://drupal.org/project/taxonomy_export).

Some of these modules use the same paths for the export and import pages but they are different modules and aren't compatible. If you have both Taxonomy Export and Taxonomy XML installed at the same time they will conflict.

Except for CSV, the other import/export modules need you to create documents in a rather wordy XML or PHP code format, which can actually be more work than entering the terms in manually. Some people may use taxonomy import/export for only the taxonomy definition rather than terms. It's sometimes unclear what happens if you want to re-import duplicate term names later.

What worked best for me was using Taxonomy Manager which gives you an improved UI for organizing terms within a vocabulary. I wish it made editing the core fields of a taxonomy more ajax-y but what it does provide is an easy way to add multiple terms at once, a textarea for pasting in a list of terms, and a way to select where the new terms will go. So you can paste in all the top level terms, then paste in all the 2nd level children of the 1st term and select the 1st term to indicate they will all go under it. As long as you don't have too many different branches, then this can be done fairly easily.

Fake Download Resume by Hacking TCP?

Submitted by tomo on August 30, 2011 - 12:51am

Just an idea. There are websites that allow resuming of partially downloaded files. But even if the browser or client (curl, wget, etc.) supports resume, you're SOL if the server doesn't support it. Some websites disallow resume intentionally.

But perhaps there's a way to fake resume by telling the web server to send data that the client doesn't actually receive. So if I've already downloaded 100 MB of a file, then I would tell the server to rapidly send the first 100 MB of the file again but actually ignore it until data after the first 100 MB is sent.

TCP? TCP (of TCP/IP which runs the internet) is a "transmission control protocol" on top of IP meaning it sets up a connection between two computers which have a reason to exchange data (like a web browser on my computer and a web server out there on the internet). Part of this protocol says how one side sends a broken up piece of data and knows it was definitely received. The sender will send out pieces of data until the receiver either says to slow down, or the sender guesses that it's sending too fast for the receiver to receive. It's up to the receiver to respond saying it has received a piece of data, or all the data up to a certain point, and signal to the sender that it can either transmit data faster or slower.

You can think of it as two sentries, one on each end of a bridge. Each sentry allows cars to cross the bridge in one direction, but to keep the bridge from backing up (or breaking under load) the other side sends a bike messenger across to say "X number of cars have crossed; send more faster" or "it seems that car license number 1073 never made it across; send that load again".

But say we don't care about the first 100 MB of cars crossing the bridge and making it to the other side, because we already received their payload earlier. What we want to do is convince the sender to send as many cars as it can as quickly as it can, causing a traffic jam which will cause cars to fall off the bridge. But we will lie and say that all the cars made it across.

The TCP equivalent would be to increase the receive window much more than usual, then periodically send ACK (acknowledgement) packets with fake sequence numbers, numbers which are ahead of what we've really already received. We guess what sequence number the sender has most recently sent out and tell them we have already received that packet so they can send more. We still need to know how much actual data the sender has sent so that we can tell when we need to go back to normal at 100 MB.

Could this really work? If it did, it would require more than a change to a browser or client. It would require hacking TCP itself, which is inside the operating system. Of course, with Linux or OpenBSD, it's quite possible to hack the networking code yourself and compile your own custom kernel, perhaps even as a loadable kernel module. Then you would also need a custom client, like a modified curl/wget, that signals to the OS that a certain TCP connection should "fake download" for a number of bytes, and then the client must resume appending to a file once the OS detects that downloading has surpassed that threshold.
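The client-side bookkeeping is the easy part and can be sketched without touching the kernel. The Python below is only an illustration, not a working fake-resume: it assumes the bytes still arrive as ordinary chunks, and the function name skip_prefix and its parameters are made up for this example. It throws away everything up to the number of bytes already on disk and passes along whatever comes after, which is the "resume appending once we pass the threshold" step. The fake ACKs themselves would have to live in the OS and are not shown.

```python
from typing import Iterable, Iterator


def skip_prefix(chunks: Iterable[bytes], already_have: int) -> Iterator[bytes]:
    """Discard the first `already_have` bytes of a stream, then yield the rest.

    `chunks` stands in for whatever the (hypothetical) modified client reads
    from the connection; `already_have` is the size of the partial file.
    """
    remaining = already_have
    for chunk in chunks:
        if remaining >= len(chunk):
            # Entire chunk is data we already downloaded earlier: drop it.
            remaining -= len(chunk)
            continue
        # Chunk straddles the threshold (or is fully past it): keep the tail.
        yield chunk[remaining:]
        remaining = 0
```

A modified client would open the partial file in append mode and write each yielded piece to it, so the file only ever grows with bytes past the old end.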

I can't be the only one who has thought of this, so let me know if this has actually been done before, or if not, why not?

Fix Seesmic Web Zombie Tweets

Submitted by tomo on August 28, 2011 - 5:00am

Seesmic Web has had a problem for as long as I've used it. I was hoping they would fix it on their own, or that the last update (which broke Seesmic Web for a while) would take care of it, but no luck. The problem is that tweets or other posts just up and disappear while you're in the middle of reading them. I use Seesmic to scan a day's worth of tweets because of its compactness and its automatic, relatively responsive infinite scroll. But you know there's a problem when all of your timeline from "2 hours ago" to "14 hours ago" is missing.

There. I fixed it.

Install Seesmic Zombie Fix for Chrome (and maybe other browsers)

The problem seems to be that Seesmic periodically culls tweets that it thinks shouldn't be shown, maybe because they're too old and there's not enough space (who knows?). This Chrome-based userscript (it should work with Firefox using Greasemonkey, and also natively in Opera) watches for the tweet-snatcher to do its reaping and then saves the zombie tweets before they end up in tweet-purgatory. The "saved" tweets will then show up at the top of your timeline with a pink background on the "ago" time.

Caveats: the "ago" time will no longer be updated automatically, and other javascript-y actions on the tweets will no longer be linked up. So the expand-contract on click no longer works. I worked around this by expanding all undead tweets. But at least you will be able to read tweets from hours ago without having them "rapture" on you.

Slice and Split a Subversion Repository

Submitted by tomo on August 28, 2011 - 4:44am

Moving a directory out of an SVN repository: I had to do this recently. It's not a black art. It's not exactly built into Subversion, but it can be done using basic svn tools. In fact, instructions are in the SVN book (relevant page).

It's as simple as:
1) svnadmin dump /path/to/repos > myrepos-dumpfile
2) svndumpfilter include somedir1 somedir3 somedir5 < myrepos-dumpfile > somedirs-dumpfile
3) svnadmin create repos; svnadmin load repos < somedirs-dumpfile

That will re-create your repository with just the directories you want to keep. Then you would svndumpfilter again on the original dumpfile for any repositories you want to create out of specific directories (e.g. somedir2).

If you follow the book's instructions about ignoring UUID when restoring (--ignore-uuid) then you won't be able to use the recreated repository from places where it's currently checked out, i.e. you would have to check out again. This might make sense if you had checked out directories that are no longer in the main repository. Naturally, for any of those directories, you will have to check out fresh using the new repositories.

Once you've restored, you'll notice that there are "phantom revisions" for revision numbers corresponding to commits for directories no longer in the tree. You could use "--drop-empty-revs --renumber-revs" when svndumpfilter'ing but I guess this would also screw up any trees that were already checked out.

Daily Twitter Posts - 08/19/2011 - 08/26/2011

Submitted by tomo on August 26, 2011 - 11:45am

08/25 23:30 Tool for online marketers: goo.gl short url analytics on right click. Testers please! http://t.co/BpTqsWU #
08/25 07:36 @jobnomade I think the bug is when you quickly hit 'more images' when it first loads, right? Thx! I got themebrain.com confirm lorem ipsum #
08/24 22:20 Password Generator now using canvas to download password image reminders. Get it at Chrome Web Store http://t.co/MBo828r @jobnomade for idea #
08/24 05:54 @jobnomade Cool idea! I'll have to look at HTML5 Canvas and other fun stuff to do that #
08/24 04:40 Moscow’s Wild Dogs Ride Subways To City Center In Search Of Food: http://t.co/EPeKI2P #
08/24 04:09 Please try out my Chrome plugin for easily making memorable passwords in many languages: http://t.co/WTSYP2L #
08/24 03:43 @betoneko Initially, I was googling for "mồng tơi" to put on "mông tôi" and I got "Mongolia" #
08/24 03:21 @NBNQ @jon7b Translation of words depends on context. Can you guys see Mongolia here? http://t.co/CCchwQJ #
08/23 23:40 Google Translates "Mông tôi" (my butt) as "Mongolia". (Not to be confused with a Mongolian butt spot) #
08/23 07:39 @barijoe I would imagine Indonesia's links to Singapore's Amazon EC2 would be pretty fast, no? #
08/23 05:03 Horns on taxis should be mechanically disabled/silent unless the foot brake is being depressed #
08/23 04:53 MacBook warranty just expired. Figures, 3 things break on it at once #
08/23 04:50 @dylanduong Never shared pics. HootSuite is solid for posting, but I like @seesmic for scanning a day's worth of tweets quickly #
08/23 03:26 @Seesmic Your Seesmic Web has been 404 for a week now. Hootsuite is working nicely, other suggestions? #
08/22 23:50 Vietnam ranks as the most financially attractive country for offshoring, BUT... http://t.co/wT2vWmA #
08/22 10:01 @jon7b I used that tutorial but saw this today: http://t.co/Wvq30nY. Then browse source examples by extension permissions. I use vim. #
08/22 09:00 Anyone have ideas for a Chrome plugin? I've been in the extending mood all weekend #
08/22 08:56 @nguyenhimself Ah, neato. DDoS is more terrorism, not hacking, and also a shitty use of Vietnam's limited overseas bandwidth. #
08/22 08:54 @jon7b Good point. All mail servers deliver to part before +, useful for filtering. But only GMail makes . insignificant for ID theft? #
08/22 08:06 @nguyenhimself Do you recall if it was trolls or people sharing other knowledge too? #
08/22 08:05 Periods in GMail usernames aren't significant. Add, remove, move the dots around, you still get mail! #
08/20 07:01 @tropixblue Unfortunately, censorship apologists would say that it's the same in every country #
08/20 07:00 @caligarn Might I suggest raincoats and backpacks with various Vietnam driving tips #
08/20 05:32 Title inflation: "Revolutionary Martyr" in Vietnam now meaningless: http://t.co/oxYc2iY #
08/20 05:21 Intriguing tweets on censorship in Malaysia, ethical hacking, and social enterprise coming via #TEDxKL today #
08/19 03:59 RT @theeconomist: Berlin has been overtaken by a strange wave of car-burning. Last night nine cars went up in flames http://t.co/ab1p5PY #

I have turned the Correct Horse Battery Staple post's Foreign Language Random Password/Passphrase Generator into a Google Chrome extension.

Here is what it looks like:

And in the Chrome web store:

Chrome App Store screenshot of Multi-Lingual Password Generator

Go install it and easily generate a secure and memorable passphrase anytime you need it!

Daily Twitter Posts - 08/12/2011 - 08/19/2011

Submitted by tomo on August 19, 2011 - 11:45am

08/18 12:31 I'll be flying to SFO on September 6 and returning to Vietnam via Tokyo on the 12th. No thanks to Hipmunk showing bunk flights. #
08/18 10:51 Are Groupon clones stalling out in Vietnam? http://t.co/ZQoHbVz #
08/18 06:32 [email protected] Anyone who claims that Khmer is not a tonal language has never tried to order squid and found out they were asking for face. #
08/17 09:49 Tuoi Tre News: Every article ends with "Police are investigating" / http://t.co/Vi1RlJY #
08/17 09:27 Super Mario: How it really went down (indie film trailer) http://t.co/dPBL5F8 #
08/17 08:43 @shakkabrutha @NicholasMarx @hcmctoday Thanks for the reassuring answers! $80 to staple some blank paper #firstworldproblems #
08/17 05:47 Any Americans know how long it takes to get extra pages added to a passport (especially at the consulate in HCMC)? #
08/17 03:06 @tamkaizen Don't worry, we will all retweet the bad news into your Twitter stream #
08/16 06:05 @caligarn In the race for economic development, Vietnam has completely lost touch with social/moral development #
08/16 04:29 Will Google open a Google Store? I expect a Chinese person to have done so already #
08/16 04:15 @betoneko @careyz Folks are bombarded by daily deals, learning that each 1 isn't so great, hearing of ripoffs. But a sucker is born every... #
08/16 04:08 @barijoe My name is available for the low cost of $19.99.95! #
08/16 03:35 @caligarn @jon7b @dynamicscholar 5 years ago, tablets were the same as laptops, running Windows. iPad dumbed it down. #
08/16 03:24 @PedroInSaigon Vietnamese people are too busy setting fire to stuff and stealing things to riot over that #
08/15 12:03 @jon7b You can make a sentence out of the 4 random words by adding words. But using 'is' for most passwords would be bad #
08/15 12:01 @jon7b If nobody knows they're not random then ok :) Problem is if anyone thinks you're using a weaker scheme. Just like using 1,0 for i,o #
08/15 11:45 @jon7b Actually, since there are much less grammatically correct 4-word sentences, it would be much easier to find by brute force #
08/15 11:15 Dammit. Foreign language passphrase generator working again: goo.gl/w5cCo #
08/15 04:36 Build a safer memorable multi-lingual password here: http://t.co/2ZET8V8 #correcthorsebatterystaple #
08/14 02:35 @barijoe You may be interested in a small Hootsuite ad blocker I made for Chrome: http://t.co/B1Q7spt #
08/13 01:13 @caligarn Ebay has been in Vietnam for years, but Paypal only recently #
08/12 06:31 @kennynguyenus 7?! Was rope involved? They passed in an instant so no pics #
08/12 06:11 Yes! I saw 6 people on 1 motorbike! They're like Vietnamese phone booths. #
08/12 05:21 Hey, you remember color.com? #vietkieu http://goo.gl/kP44I #

Police are investigating.

Submitted by tomo on August 17, 2011 - 5:53pm

If you read much English-language Vietnamese news online, you'll often read this conclusion and chuckle silently: "Police are investigating."

These are crime stories and you may imagine to yourself that police are investigating, how they're looking for clues (cue CSI: Saigon), and at what point they conclude their investigation conclusively. (There are far fewer followups in the papers.)

There's a new Tumblr about this phenomenon of police investigating things. A sample:

[Sorry, the link went down.]

Police are investigating all sorts of claims. Even children's arithmetic puzzles.

Seriously, every quote from the police: "we are investigating". Here is a bunch of random samples from recent news articles at Tuoi Tre:


Inspired by XKCD, this is a password generator for those of you who know English and Vietnamese or another language. Once a random set of words in your languages has been generated, images for those words will be shown to help you visually remember your new password. If the random password seems too hard to remember, you can always spin the wheel a second time!

Each time you click, 4 random words from the selected languages will be loaded. I chose the number 4 so as to not overload Google Image search, so you may want to run it twice to get 5 or more words for added security. I find that the images help to visually remember the password.
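For the curious, the word-picking step might look roughly like this in Python. This is a sketch and not the code behind the generator on this page: the tiny WORDS table is placeholder data, and the name generate_passphrase and its parameters are invented for the example. The one detail that matters is drawing words with a cryptographically secure random source (Python's secrets module here) rather than an ordinary pseudo-random generator.

```python
import secrets

# Placeholder word lists; a real generator would load a few thousand
# common words per language.
WORDS = {
    "English": ["horse", "battery", "staple", "correct"],
    "Vietnamese": ["ngựa", "pin", "ghim", "đúng"],
}


def generate_passphrase(languages, count=4, words=WORDS):
    """Pick `count` random words from the combined lists of `languages`."""
    # Merge the selected languages into one pool so every word is equally likely.
    pool = [w for lang in languages for w in words[lang]]
    # secrets.choice draws from the OS's secure random source.
    return [secrets.choice(pool) for _ in range(count)]
```

Calling generate_passphrase(["English", "Vietnamese"]) returns a list of four words; calling it again is the "spin the wheel a second time" step.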

If you still want a password like "!Agt:m%p>" then it's also an option below.

Choose Languages: English, Vietnamese, Japanese, German, French

Your Random Password: Click that button up there!

Or use this harder to remember but shorter string of 9 characters

The other day there was an XKCD strip about password security. The idea is that we've been trained over the years to use passwords like 'Tr0ub4dor&3' because they mix upper and lower case, use numbers, and special characters. But a password like that is based on a common English word using a common substitution pattern (l33tsp34k) of numbers for letters and is much easier for a hacker to guess than four random words like 'correct horse battery staple', which is longer but much easier to remember.

A good password should be random. Humans aren't random and 'Tr0ub4dor' looks random enough but it isn't. Even translating the word into a foreign language is by itself weak. Generally, if you come up with the password yourself then it's not anything close to random.

Plenty of software exists to come up with passwords made up of random characters. The problem is that these passwords weren't meant to be memorized. Writing your password down somewhere sort of defeats the purpose.

So four random English words make a pretty good password, but one that is still hard to remember if the words are obscure and unfamiliar. Out of the over one hundred thousand words in an English dictionary, only a few thousand are commonly used.

So a few thousand English words are generally useful. But those of us who are bilingual can basically double the size of the vocabulary used! This foreign language random password generator seeks to take advantage of that numerical weapon, and with a large number of possible languages (and even more language combinations), even if a hacker got an encrypted password file it would be as hard to crack as a random 9-character totally impossible to remember string.
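To put rough numbers on the vocabulary argument, here is a small Python sketch of the usual entropy arithmetic. It assumes every word (or character) is picked uniformly at random and independently; the function names are made up for this example, and the vocabulary sizes you plug in are your own guesses.

```python
import math


def passphrase_bits(vocabulary_size, word_count=4):
    """Entropy in bits of `word_count` words drawn uniformly from the vocabulary."""
    return word_count * math.log2(vocabulary_size)


def random_string_bits(alphabet_size, length):
    """Entropy in bits of `length` characters drawn uniformly from the alphabet."""
    return length * math.log2(alphabet_size)


# Example comparison (the sizes below are illustrative assumptions):
one_language = passphrase_bits(3000)       # a few thousand common words
two_languages = passphrase_bits(6000)      # bilingual: double the pool
nine_chars = random_string_bits(94, 9)     # 94 printable ASCII characters
```

Doubling the vocabulary adds exactly one bit per word, so four bits for a four-word passphrase. Nine characters from the 94 printable ASCII symbols come to about 59 bits, so how close a four-word passphrase gets depends on how many languages and how many words per language go into the pool; adding a fifth word helps more than adding a language.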

You can increase the security of your password further by using a "salt" random string (non-dictionary word) that you remember and always use with your passwords, and by adding punctuation in one of the words.

UPDATE: There is now a Chrome extension that makes creating passwords on the fly really fast and easy! Check out the Correct Horse Battery Staple Google Chrome Extension

Daily Twitter Posts - 08/05/2011 - 08/12/2011

Submitted by tomo on August 12, 2011 - 11:45am

08/11 06:17 @TylerWatts @officespace_sg To avoid unnecessary ridicule it's probably best to avoid Vietnamese names ending in 'ng' 'c' (and more) :) #
08/11 05:28 @OfficeSpace_SG Since 'j' can sound like 'zh' or 'y' in other languages, it could replace 'd'. Thus 'jinh' vs 'dinh', 'jung' vs 'dung'. #
08/11 05:00 @OfficeSpace_SG Can we replace 'd' with 'j' (or 'dj') and get rid of 'd stroke'? Gov't can't stop kids from using new, foreign words #
08/10 17:46 "Groupon updates IPO filing, admits it's unprofitable" http://goo.gl/55jo8 Failed to fool SEC, will they postpone IPO like others? #
08/10 10:22 Dow down nearly 400 points already! RT @Benzinga: Yesterday's gains are gone. Face ripper rally off. #
08/10 10:05 Studying Vietnamese by reading jailed French-Viet blogger Prof. Pham Minh Hoang aka Phan Kien Quoc: http://goo.gl/mafU5 #
08/10 05:41 Have a backup in case Anonymous takes down Facebook on Nov 5: http://goo.gl/8WtBW #
08/10 04:58 RT @AsCorrespondent: 7 carjackers killed in Philippine shootouts http://bit.ly/o0VLvv #Philippines #Asia #News #
08/10 04:51 Most dangerous countries to be female: Afghanistan, Congo, Pakistan, India, Somalia http://goo.gl/xqFMD #
08/09 03:46 A kitty cafe in Japan where you can rent kitty by the hour: http://goo.gl/HmkxW #
08/09 03:39 @careyz VN should first build normal-speed rail to Can Tho, commuter rail to Dong Nai/Binh Duong, and a freight rail system #
08/09 03:24 @OneVietnam I have data that indirectly shows Facebook is less blocked/easier to access nationwide in recent weeks #
08/08 17:36 What’s the Fastest Web Browser in the “Real World?” Chrome. http://tcrn.ch/ntHhjL #
08/08 15:58 @PedroInSaigon Many definitely don't care. I'd argue that ignorance helps breed apathy and complacency. #
08/08 15:56 @NBNQ Yup, we had it all then. And Bangkok (and maybe Singapore) benefited financially and rose from Saigon's fall. Morale or morals? :) #
08/08 12:10 Before, I would have been sad to find a snail shell in my soup. Now: hey free food! #
08/08 01:55 "One of the main causes of pollution is a simple lack of awareness on the part of local residents" #vietnam #
08/08 00:13 Looking back at Pre-1975 Saigon: http://goo.gl/BJJuU #
08/07 14:53 @dynamicscholar Yeah, agree the way to live peacefully is to not stress over things one can't change #
08/07 14:51 @michaelcoyote @jon7b @PedroInSaigon Feels good to vent, but we're just preaching to the choir. The ones who need to hear, need to in VNese #
08/07 03:23 @dynamicscholar @caligarn Looks like petanque. Game played in a lot of former French areas (US included) #
08/07 03:19 The futility of complaining about Vietnam in English. #
08/06 16:24 @lokimorgan hi Morgan! :) #
08/06 04:43 @mybigfatface I am impressed indeed, and think it would make an excellent Twitter profile pic! #
08/06 04:00 I finally understand why Westerners can't do the Asian squat: http://goo.gl/rU6Td #
08/05 06:01 Rep'n my hood RT @thisbigcity: New Post: In Awe of Vietnam – Stunning Photography from Ho Chi Minh City http://bit.ly/q6EJj1 @jodiehuynh #
08/05 04:36 Robocop is needed. RT @tuoitrenewsvn: Robbers besiege Ho Chi Minh City http://bit.ly/oUTzU2 #

© 2010-2014 Saigonist.