Google code jam solution for alien language


In the 2009 qualification round there was a simple problem with a nice background story:

After years of study, scientists at Google Labs have discovered an alien language transmitted from a faraway planet. The alien language is very unique in that every word consists of exactly L lowercase letters. Also, there are exactly D words in this language.

Once the dictionary of all the words in the alien language was built, the next breakthrough was to discover that the aliens have been transmitting messages to Earth for the past decade. Unfortunately, these signals are weakened due to the distance between our two planets and some of the words may be misinterpreted. In order to help them decipher these messages, the scientists have asked you to devise an algorithm that will determine the number of possible interpretations for a given pattern.

A pattern consists of exactly L tokens. Each token is either a single lowercase letter (the scientists are very sure that this is the letter) or a group of unique lowercase letters surrounded by parenthesis ( and ). For example: (ab)d(dc) means the first letter is either a or b, the second letter is definitely d and the last letter is either d or c. Therefore, the pattern (ab)d(dc) can stand for either one of these 4 possibilities: add, adc, bdd, bdc.

solution with regexp

We have a set of correct words and patterns which should match the words. How many words are matched by every pattern?

words = [“abc”, “bca”, “dac”, “dbc”, “cba”]

The first pattern “(ab)(bc)(ca)” means, that the first character can be “a” or “b”, the second “b” or “c” and the third “c” or “a”.

The solution should print out how many words in the alien language match the pattern. After so many “match” and “pattern” you know the solution: regular expression! One pattern can be converted into a regular expression:

searchStr = line.replace(“(“, “[“).replace(“)”,”]”)

Here the complete solution uses the filter function again (read more about the functional part of python) to shrink the code:

import sys, re
fp = file(sys.argv[1])

#read params
(l, d, n) = [int(x) for x in]

#read words
words = [ for x in range(d)]

#read pattern
for i in range(1, n+1):
    searchStr ="(","[").replace(")","]")
    searchIt = re.compile(searchStr).search
    print "Case #%d: %d" % (i, len(filter(searchIt, words)))


25 words with 10 characters and 10 patterns to check:

time python > alien_small.out

real	0m0.120s
user	0m0.108s
sys	0m0.012s

5000 words with 15 characters and 500 patterns to check:

time python > alien_large.out

real	0m13.398s
user	0m12.821s
sys	0m0.316s

other solutions

I found (longer) solutions in other programming languages, feel free to read and comment them or offer an alternative or even better solution – we will link your article.

The rotate google contest in 15 lines

The rotate example nico last week reported was funny to solve: No rotating needed! Read the complete problem description in nicos article or in the google code contest page. Here my complete python solution with file handling and solution printing.

import sys
from re import search

fp = file(sys.argv[1])
for case in range(1, 1+int(
    n, k = [int(x) for x in]
    lines = [ for x in range(n)]
    #right-gravitation and joining to one line (delim is a line)
    s = "#".join(map(lambda x:"%%%ds"%n%(x.strip().replace(".","")), lines))
    result = 0

    #find one of the 4 variants: horiz, verti, slash, backslash for Blue
    reB = "(B.{%%d}){%d}B" % (k-1)
    if search(k*"B", s) or search(reB%(n-1), s) or search(reB%n, s) or search(reB%(n+1), s):

    #find one of the 4 variants: horiz, verti, slash, backslash for Red
    reR = "(R.{%%d}){%d}R" % (k-1)
    if search(k*"R", s) or search(reR%(n-1), s) or search(reR%n, s) or search(reR%(n+1), s):
        result= result+2
    print "Case #%d: %s" % (case, ["Neither", "Blue", "Red", "Both"][result])

There are 15 effective lines of code for the solution. And some lines are used for imports or reading from the input file (comments and spacer not counted) – cool?

handling the gravitation

To understand the code read the problem description in nicos article or in the google code contest page. I will describe now the two steps for the solution. First the play field have to sorted like a gravitation to the right. All dot character are handled like spaces and the capital letters goes to the right border.

We start with a n by n matrix:

..R          ..R
.R.   ->     ..R  
R.B          .RB

To remove the dots simple replace it with nothing as a standard string manipulatin method. To fill it from the left use the common used string formating well know from printf.

newLine = “%3s” % lin.replace(“.”, “”)

To do this with the length n use this obscure-looking multi step format string:

newLine = “%%%ds” % n % line.replace(“.”, “”)

And to do this with a complete list of lines use a list comprehension (read more about functional programming in python).

lines = map(lambda x:"%%%ds"%n%(x.replace(".","")), lines)

This is done in line 10 in the example (with joining to one string).

find the winner

Second part is to find the k=4 characters in a line:

_RBBB   -> "RBBBB#_RBBB#__RBB#___RB"

I concat all lines to one string to find the four different directions with regular expressions: “RBBBB-_RBBB-__RBB-___RB”

  • The horizontal line is easy:

    or with k:

  • the vertical line can found by 4 B’s and n characters (one line up):

    or shorter:

  • the backslash line is the same like the vertical line, only with n+1 spaces
  • the slash line is the same like the vertical line, only with n-1 spaces

I prepared the regular pattern with k:

reB = "(B.{%%d}){%d}B" % (k-1)

and used the multi step format string for the different pattern strings (see example above in line 15).

That’s all!

Google Code Jam – Rotate

It’s time for some basic finger exercise. The Google Code Jam Rotate is very trivial, so relax and fire up your IDE.

I was a bit lazy, so there is no reading of the input sets, just a two-dimensional array and two functions


As the Google solution pointed out, there is actually no need to really rotate the 2dim array. Just push everything to the right, as if gravity would be to the right. That’s the same as rotating everything and keeping gravity towards the bottom. But we save a few lines this way.

So here is the “gravity from the right” code

public static void fakeRotate(char[][] board) {

    for (int i = 0; i < N; i++) {
        for (int j = N - 1; j >= 0; j--) {
            if (board[i][j] != '.') {
                // push to right
                int m = 1;
                while ((j + m) < N && board[i][j + m] == '.') {
                    board[i][j + m] = board[i][j + (m - 1)];
                    board[i][j + (m - 1)] = '.';

Checking for a winner

Now we have everything ready to look for a winner. I decided to start from the top left and walk through the game board step by step. I keep going right till I'm at the end, then go to the beginning of the next line.
Progressing this way, I only need to check in four directions. Towards right, bottom, diagonally upwards-right and diagonally downwards-right.

public static void checkForWinner(char[][] board) {
    boolean redWins = false;
    boolean blueWins = false;

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            if (board[i][j] != '.') {

                // winning once is good enough
                if (board[i][j] == 'B' && blueWins == true) {
                if (board[i][j] == 'R' && redWins == true) {

                // check to the right
                int m = 1;
                while (j + m < N && board[i][j + m] == board[i][j]) {
                    if (++m == K) {
                        switch (board[i][j]) {
                        case 'R':
                            redWins = true;
                        case 'B':
                            blueWins = true;

                // check bottom
                m = 1;
                while (i + m < N && board[i + m][j] == board[i][j]) {
                    if (++m == K) {
                        switch (board[i][j]) {
                        case 'R':
                            redWins = true;
                        case 'B':
                            blueWins = true;

                // check diagonal bottom
                m = 1;
                while (i + m < N && j + m < N && board[i + m][j + m] == board[i][j]) {
                    if (++m == K) {
                        switch (board[i][j]) {
                        case 'R':
                            redWins = true;
                        case 'B':
                            blueWins = true;

                // check diagonal top
                m = 1;
                while (i - m >= 0 && j + m < N && board[i - m][j + m] == board[i][j]) {
                    if (++m == K) {
                        switch (board[i][j]) {
                        case 'R':
                            redWins = true;
                        case 'B':
                            blueWins = true;





That's a bit of copy and paste, maybe you have a better solution.

That was already it. Not much to say, is there?

If you're bored you can do some improvements and share it in the comments.

  • optimize the code for checking for a winner, e.g. stop checking if the space is not sufficient for K stones to win
  • use multiple threads to check the board
  • write a parser to read the input file from the Google Code Jam

I hope you had some fun, questions and improvements are welcome. If you want to submit a patch write me using the contact form.

The code can be found as usual on github

Google Code Jam – Theme Park

Theme Park is another good example of a problem which can be solved with a fairly simple algorithm. The simple version will solve the small input set without a problem, but will run almost infinitely with the large dataset.

You can read the problem on the Google Code Jam site: Theme Park

simple solution

The simple solution is to put the groups in a static array, use a pointer that “wraps around” using modulo and fill the roller coaster until it’s so full, that the next group does not fit in, then let it ride earning income accordingly to the seats taken.

This is a correct solution, but unfortunately a bit slow. The input set can have a roller coaster with up to 109 seats and 108 rides and groups as large as 107.

So going through each entry every time step by step would use a lot of calculation time. The be precise, that would be O(NR).

simple optimization, brining it down to O(N)

As you can see, the roller coaster runs way more often than groups stand in line. So to some point, you will reach a group which has already been in the roller coaster as first group. And that’s the exact situation we’re looking for.

Every time we fill the roller coaster, we store the position of the first group who enters the ride, remember how many people went in and store the position of the group which will fill the roller coaster first on the next run. If we then check before the run, if this group has been in first, we already know the result and we know, that from thereon we know the subsequent groups.

From this point on we can iterate on our stored results and performance increases to O(R), that’s a speed up of at least 103.

3 caching steps to boost your webservice by x10

I am running a IP-Geo-location demo in my App Engine. After many silent months I got some traffic on it (and hitting the daily CPU limit). Some one would kick the “traffic monster”, after all it’s starts only as a demo, other ones (like me) will try to optimize the GEO-location script.

App Engine overquota

When you read the message “This Google App Engine application is temporarily over its serving quota. Please try it again later” (or simply a HTTP-Status 503) you can activate the App Engine billing option. Or it’s time to start caching in the different application levels:

introduction for the caching article

What is the IP-Geo-Location webservice? After including a small javascript you can access to the geo location of a normal visitor based on his IP address. The service can be a door opener for different landing pages or localized content. You can forward the visitors (by identifying his home location/city) to special local sites or offer local ads.

other articles about the geo location

The described caching steps works perfectly if you do not need to invalidate your cache. This detailed example only reads from external resources and can use long caching times. After some improvements I got a speed up of factor 10! Or in other words: from 50.000 requests/day I increased the App Engine capacity to 500.000 requests/day (and hitting the cpu limit again …) All logged times come from the logfiles and profiling with Appstats helps with the sequence diagrams (generated live).

The profiling images are examples of the google Appstats tool in the App Engine and show live examples of requests. Its slow down your engine up to 10% (on high load) and saves all statistics in memcache. There are different values in the graphics:

  • blue cpu time: time “wasted” in my own python code
  • red api cpu time: time needed by the services (memcache, database, fetch and many more)
  • requests total time: time for the request, can stopped in the browser

What we have to optimize

The basic webservice requests have to do the following steps:

  1. accept a GET request (param is the calling IP) on a javascript-url
  2. ask up to 4 web-service for data
  3. generate resultString as javascript code
  4. return the resultString

The result is a dynamically generated javascript wich contains only the data. A second static script with functionality over this values is added as a static file and google does all the caching stuff for static content. There are functions for finding the next city in a given set or calculate the distance of the visitors home and your given location.

var com = com||{};  
com.unitedCoders = com.unitedCoders||{};  
com.unitedCoders.geo = com.unitedCoders.geo||{};  
com.unitedCoders.geo.ll = [{"lat": -21.78330, "city": "Araraquara", "name": "maxmind", "long": -48.16670, "country": "Brazil"}];

    Costs: every request results in 4 calls to webservices

  • not much cpu, no local storage, long running (depend on slowest webservice)
  • worst case: up to 2 seconds, but on average 100 ms cpu-time

cache external resources

Notes: Not every used webservice will give results for every called IP and the free daily amount of web-service-calls are limited. Second the assignment between IP and location will rarely change – an ideal setup for caching on application side. A long time cache like a local database with timestamp is a good choice for saving requests to the webservices and a short time cache for faster lookups.

  1. accept an GET request (param is the calling IP) as a javascript-url
  2. check if result is in local database
  3. otherwise ask up to 4 web-service for data
  4. save the results in the local database
  5. generate resultString
  6. return the resultString

The profiling image shows the result. It tests if the IP is in the local database – this is implemented as a keyname call (in sql syntax a request to the primary index) and very fast. If not it starts 4 parallel requests to the webservices and transforms the results. Then it saves it in the database and returns the result. This does require expensive resources and can have long request time (depending on the webservices).

Appstats profiler for fetching and saving into db

Personal experiences: After running this with many visitors, my database filled with known values and the count to call the webservices will go down. I got after 10 million requests (from one site in brazil) an 97% hit chance (this will highly depend on the calling visitors and the saved data!!!) in my database!

    Cost: 3% of requests fetch from 4 webservice and writing into database

  • longtime usage on the storage
  • real=642ms cpu=700ms api=273ms overhead=1ms (8 RPCs)

If the result is in the local database the request can use the saved result and therefore is fast. I used a keyname (like primary index on sql databases) to find the record for the called IP. The time for finding the result is nearly constant for the first 100.000 records I tested in my App Engine instance.

    97% of requests reads from database

  • normal time, some cpu, many memory
  • real=59ms cpu=67ms api=8ms

cache internal resources

Using local caching or the fast memcached is the classic way – every call to a database can be cached in memory too. Saving data in the memcache is fast and can be made asynchronous – no waiting time. The memCacheD is a standard service in the App Engine and should be faster than the distributed bigtable.

  1. accept an GET request (param is the calling IP) as a javascript-url
  2. check if calling IP in memcache -> return it
  3. check if result is in local database-> return it
  4. otherwise ask up to 4 web-service for data
  5. save the results in the lcoal database
  6. generate the resultString
  7. dump the resultString in memcached
  8. return the resultString

Appstats profiler for fetching and saving into db

The complete and worst case request is only a little bit more expensive than the basic call. It is asking the memcache and than the local database. If no data is found calling all 4 webservices. The result has to be transformed and saved in local database (saving is not asynchronous in the App Engine) and memcache.

Appstats profiler for reading from cache

If the request found a hit in the memcache all is very fast!

Appstats profiler for reading from db

If the script found the record in the database, it has to save it in memcache and return the result. See here for some python code lines for doing this in the App Engine. The function computeData tries to read the record from the database. If not found start the webservice calls, saving into database and returning the record.

def get(request, response):
     resultCacheKey = "geo_data(%s)" % request.get("ip")
     resultStr = memcache.get(resultCacheKey)
     if resultStr:
     #some code
     result = computeData()
     resultStr = self.pageTemplate.render(Context(result))
     memcache.set(resultCacheKey, resultStr)

  • 3% requests to 4 webservices
  • 67% hit rate in database (long-time cache)
  • 30% hit rate in memcache (short-time cache)
    • fast, minimal cpu, some memory
    • real=6ms cpu=39ms api=0ms

Tip: Don’t cache the database result directly (with wrapper or annotations). Try to dump the webservice result in memcache as late as possible. You next memcache hit does not need to convert the result data.

use the browser cache

Some calling browser/IPs (not only anonymous proxies with the same IP) come several times an hour – I don’t know why. The result depends on the calling IP – not by the url – it can’t be cached for all caller transparently. I defined in the http response header one hour as the private cache time – in this period the location should not change dramaticaly and the browser should load the script from his browser cache.

Cache-Control: max-age=3600, private

After one hour the second identifier would be the IP itself, transported as etag header. If the IP did not change since the last call the status code will return 304 (s. rfc 2616) “Not Modified”. In this case the webservice has nothing to do – the browser cache is valid. The request answer means: your cache is valid – use it (since the last call with the given etag).

response.headers["Cache-Control"] = "max-age=3600, private" 
response.headers.add_header("Etag",  request.get("ip")) 
if request.headers.get("If-None-Match", "") == request.get("ip"):

App Engine requests stats with activated cache control

The activated cache-control option decreased the request count by 30% and increased the capacity for the app! The cpu load did not dramatically change because the old requests were caught by etag check or by memcache cache hits.

Requests with an adequate etag will only cost checking the header and is reported with the minimum cpu ms (19ms) in the log files. After active cache-control settings every 20 requests an etag logline appeared. The example with the Etag matching can also be used on url-parameters or cookie-values.

Check your webservice with the cache header test tool from Resource Expert Droid or read more caching hints in

Request rate before and after optimization

App Engine requests stats without optimizations

Here you can see the daily statistic on the App Engine dashboard with a 2 requests/seconds maximum and 50.000 requests/day. I got some more traffic and the result was the Overload screen (started with this image in the article).

Google App Engine requests stats with some optimizations

With the prior noted improvements the free and basic Google App Engine runs with up 20 requests/second. The daily traffic of 400.000 requests/day can now be handled. Altering my App Engine code was fun and it was exciting to see if I can stay below the daily 6.5 hour cpu limit, or if the cpu limit was ok but the bandwidth limit will become the new challenge.


I hope this helps to find optimization options for your webservice on the server and client side! If you have more tips or questions, ask for it. If you want to check your website try out the yahoos yslow or google page speed plugins for basic client checks or the profiling tools Appstats on google app engine.

Android: Dealing with ListActivities, customized ListAdapters and custom-designed items

Due to its ListActivity class the Android SDK helps developers to create List-GUIs easily. Therefore the ListActivity uses a ListView. But what about customized lists? What about own list-designs? In this post I’ll try to show how to extend a standard ListView to create a custom list-design. To create your own list-design, you should define how an item in your list should look like. As an example I’ll use a list of Twitter-updates (also called „tweets“).

To define how a tweet should be displayed you can create a new layout-file (e.g. tweet.xml) in


. Its content could look like this:

This simple layout definition declares how a single tweet should be structured and displayed as an item in the list. To let the


use this layout a customized


must be set. To create such an adapter the class

<a title="Reference - BaseAdapter" href="">BaseAdapter</a>


can be used.


public class TweetListAdapter extends BaseAdapter {

    private List tweetList;

    private Context context;

    public TweetListAdapter(List tweetList, Context context) {
        this.tweetList = tweetList;
        this.context = context;

    public int getCount() {
        return tweetList.size();

    public Tweet getItem(int position) {
        return tweetList.get(position);

    public long getItemId(int position) {
        return tweetList.get(position).getId();

    public View getView(int position, View convertView, ViewGroup parent) {
        LinearLayout itemLayout;
        Tweet tweet = tweets.get(position);

        itemLayout= (LinearLayout) LayoutInflater.from(context).inflate(R.layout.tweet, parent, false);

        TextView tvUser = (TextView) itemLayout.findViewById(;

        TextView tvText = (TextView) itemLayout.findViewById(;

        return itemLayout;


The method


now returns a custom design for each list-item based on the tweet.xml. To use this new


within your


you should set it to




. Therefore the


-method is recommended.

public class TweetListActivity extends ListActivity {

    public void onCreate(Bundle savedInstanceState) {
        List tweetList = loadTweets();

        ListAdapter adapter = new TweetListAdapter(tweetList, this);


That’s all. If you now run your Activity within the SDKs emulator every item should look like defined in


. If you like to you can extend the


on your own and add things like an image and/or the tweet’s posting-time. If you do so the


-method of


should be extended too to fill all nested fields.

Google Code Jam – The Snapper Chain

The Google problems are always nice, sometimes even a bit too tricky. This years entry problem was quite nice with some difficulties you could run in.

You need to read the problem first, to follow the article, so here’s the link:

If you’re a hands on programmer first you might just want implement the chain without looking at the problem more extensively. So you might end up with something like this:

                        // snapping
// snapping
boolean toggle = true;
for (int j = 0; j < snaps; j++) {
	toggle = true;
	for (int k = 0; k < snapperCount; k++) {
		if(toggle == false){
		if (toggle == true) {
			snapper[k] = !snapper[k];
		if (snapper[k] == true) {
			toggle = false;


// we need power on all snappers
boolean power = true;
for (int j = 0; j < snapperCount; j++) {
	if (snapper[j] == false) {
		power = false;

Continue reading

What are yoda conditions?

Yoda was a great teacher except for his word sequence. For programmer exists the yoda condition :

if (value == 42) { 

As yoda condition written:

if (42 == value) { 

I found some discussions on and collected pro and contras.

prevent an assignment

if (value = 42) { 

If you make this typo in javascript the condition is true and value is changed. In C your compiler would give you an warning, as yoda condition this would not work and your compiler will throw a compiler error.

if (42 = value) { 

null checks

if (myString != null && myString.equals(“yes”)) { 

This typical double check in java is tiring and will be shorter as yoda condition:

if (“yes”.equals(myString)) { 


With javascript the in-operator can find an attribute in myObject:

myObject = {name: "Christian Harms" };
if ("name" in myObject) { 

The variant using value as first part is ok. A more object oriented variant will ask the container if the attribute is true.

if (myObject["name"]) { 

The same example in python with a list (because the example wont work with javascript Array):

myList = [1, 2, 3]
if 1 in myList: 

The yoda condition can used with a list method:

myList = [1, 2, 3]
if myList.count(1): 


yoda says: “Try not. Do or do not, there is no try.”

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.