Friday, July 03, 2009

A kid who could read my mind

Last night, I was with a group of people I didn't know. In the group, there was a kid who claimed that he could read the minds of others. I did not believe it. So I approached him. He covered both my ears with his palms and asked me to think of something. I thought of the number '5' and he immediately said aloud, "5". I was shocked. How could the kid know that I thought of that number? To make sure he did not simply get lucky, I thought of another number, '7' and he immediately declared that I thought of '7'. I thought of a few more numbers and he correctly announced all the numbers I thought of. I was completely befuddled. The people around were equally perplexed. They felt that both of us were working together to hoodwink others. An old man in the group who seemed intrigued by all this approached us and asked us to perform more experiments in a structured manner. He proposed that the numbers be generated using a pseudorandom number generator of a scientific calculator and written down on a paper. He asked me to read the paper and then think of the numbers written there and see whether the kid could then read my mind.

While he was preparing a list of random numbers, I was trying to guess how the kid was reading my mind. He couldn't be correct every time just by luck. He seemed to have direct access to my mind. However, that would be a miracle. I kept on thinking, coming up with various hypotheses and discarding them. By sheer luck, I thought of a model involving one random number generator (RNG) connected to two displays. It provided me some clues. In the model, the output of the RNG appeared immediately on the first display, but only after a delay of 1 second on the second display. Now, it would seem that the first display was predicting the output of the second display. I started thinking more in that direction. What could possibly be the source of the thoughts that was accessible to both?

After a little thought, it occurred to me that it could be my own mind. Probably the kid and the people I could see were simply creations of my mind. Perhaps, I was asleep and dreaming. I wanted to test the hypothesis. If it was only a dream and my mind was creating all the experience, I should have been able to read the minds of others. I had no clue how the experience of reading others' minds would feel like but I went ahead to test the hypothesis. I asked a person to help me. He agreed. I covered his ears with my palm and asked him to think of a number and suddenly the number '17' echoed in my mind. I was thrilled. I asked him whether '17' was the number he thought of. He confirmed that it was indeed the number he had thought of.

Unfortunately, I could not stay in this state of lucid dream for long. I woke soon after I tested my hypothesis a few times. In case, you find lucid dreaming intriguing you should watch Waking Life, a very thought provoking movie that shows the experience of a man trapped in a persistent lucid dream state.

Saturday, May 16, 2009

ADAC and HE puzzles from GEB

I have been reading Gödel, Escher, Bach: An Eternal Golden Braid by Douglas R. Hofstadter since the last Monday. The book alternates between chapters and dialogues. In the words of the author:
The long and the short of it is that I eventually decided - but this took many months - that the optimal structure would be a strict alternation between chapters and dialogues. Once that was clear, then I had the joyous task of trying to pinpoint the most crucial ideas that I wanted to get across to my readers and then somehow embodying them in both the form and the content of fanciful, often punning dialogues between Achilles and the Tortoise (plus a few new friends).
After the second chapter (Chapter II: Meaning and Form in Mathematics) there is a dialogue between Achilles and the Tortoise on telephone. The title of the dialogue is Sonata for Unaccompanied Achilles. Achilles is the only speaker, since it is a transcript of one end of a telephone call. The Tortoise is at the far end of the call. The sentences spoken by the Tortoise at the other end are not present. This makes it very interesting as we keep guessing what the Tortoise might have spoken.

It starts in this manner.
Achilles: Hello, this is Achilles.
Achilles: Oh, hello, Mr. T. How are you?
Achilles: A torticollis? Oh, I'm sorry to hear it. Do you have any idea what caused it?
As the dialogue proceeds, they share a few puzzles. Here is the first one from the Tortoise.
Achilles: A word with the letters 'A', 'D', 'A', 'C' consecutively inside it ... Hmm ... What about "abracadabra"?
Achilles: True, "ADAC" occurs backwards, not forwards, in that word.
Achilles: Hours and hours? It sounds like I'm in for a long puzzle, then. Where did you hear this infernal riddle?
Here is the second one from Achilles.
Achilles: Say, I once heard a word puzzle a little bit like this one. Do you want to hear it? Or would it just drive you further into distraction?
Achilles: I agree - can't do any harm. here it is: What's a word that begins with the letters "HE" and also ends with "HE"?
Achilles: Very ingenious - but that's almost cheating. It's certainly not what I meant!
Achilles: Of course you're right - it fulfills the conditions, but it's a sort of "degenerate" solution. There's another solution which I had in mind.
Achilles: That's exactly it! How did you come up with it so fast?
Achilles: So here's a case where having a headache actually might have helped you, rather than hindering you. Excellent! But I'm still in the dark on your "ADAC" puzzle.
SPOILER ALERT: If you want to think on these puzzles, don't read further as there are spoilers below.

It didn't take much time for me to solve the puzzle because I cheated with the word list file available in Debian 5.0.

Here is the output of my cheating.
susam@nifty:~$ grep adac /usr/share/dict/words
headache
headache's
headaches
susam@nifty:~$ grep ^he.*he$ /usr/share/dict/words
headache
heartache
So, the answers to both puzzles seem to be 'HEADACHE'. Read the last sentence, in the dialogue I have shown above, again. It makes sense now as Achilles says that having a headache might have helped the Tortoise.

Later in the dialogue the Tortoise offers 'figure' and 'ground' as hints to the 'ADAC' puzzle.
Achilles: Well, normally I don't like hints, but all right. What's your hint?
Achilles: I don't know what you mean by "figure" and "ground" in this case.
Achilles: Certainly I know Mosaic II! I know ALL of Escher's works. After all, he's my favorite artist. In any case, I've got a print of Mosaic II hanging on my wall, in plain view from here.
Achilles: Yes, I see all the black animals.
Achilles: Yes, I also see how their "negative" space - what's left out - defines the white animals.
Achilles: So THAT's what you mean by "figure" and "ground". But what does that have to do with the "ADAC" puzzle?
Achilles: Oh, this is too tricky to me. I think I'M starting to get a headache.
The famous painting discussed in the dialogue can be found here: http://www.worldofescher.com/gallery/A30L.html. It can be seen how the black animals form the figure or the positive space and how the background or ground or negative space beautifully fits all the white animals.

I was unable to use this hint to solve the puzzle. But after cheating and finding the answer I could make sense of the hint and understand how 'figure' and 'ground' lead to 'HEADACHE'. The first puzzle has 'ADAC' in the question. Let us consider 'ADAC' as the figure or the positive space. Now, if we remove 'ADAC' from 'HEADACHE', we are left with the ground or negative space, which consists of 'HE' in the beginning of the word and 'HE' in the end of the word. The figure is used to make the question in the first puzzle. The ground is used to make the question in the second puzzle.

An interesting question is: What was the first answer from the Tortoise that Achilles found very ingenious but degenerate. I believe, it is 'HE' as this word begins with 'HE' and also ends with 'HE'.

The funny thing is that both of them asked two puzzles to each other without knowing that the answers to them are same. This is exactly what happened when a colleague of mine and I challenged each other with combinatorics puzzles. I wrote a blog post on this here: Combinatorial coincidence.

Saturday, April 11, 2009

Butter and Mashed Banana

Today, I went to Ranga Shankara to watch a play called Butter and Mashed Banana. Here is the description of the play from the Ranga Shankara website.
Butter and Mashed Banana is a hilariously funny take on freedom of speech in our country. It goes from three actors in real life to three characters on stage caught up in a frenzy of activity ranging from International best sellers that are made into Oscar winning films to instant stardom and conflicting ideologies. The story revolves around a boy who goes on to become a world-famous writer, but faces rabid opposition to anything he tries to say or do back home in India. Interspersed with music and movement the play tangentially takes digs at ideology and censorship, among other things. Although its intent is to entertain, the play is a result of dissatisfaction over how poorly the freedom of speech is protected. Like it says in the play, there is always someone in this country waiting to tell you… "Don't you dare say that!" Created as a satirical farce, Butter and Mashed Banana uses dance, movement and live music to heighten the funny elements.
I have spent my entire childhood at a small town called Rajgangpur where my father works in OCL India Limited. There used to be plays, cultural performances and pravachans in the company auditorium where I used to go and enjoy. Also, cultural activities in my school, Dalmia Vidya Mandir, almost always included an English play, a Hindi play, a solo song, a group song and a classical dance from each house (We were divided into three houses: Vivek, Vikas and Vinay). Having lived in such an environment, I still retain an interest in plays as well as classical songs and dances. When, I came to Bangalore, I knew about Ranga Shankara and many times I have looked up the various plays performed there on the web. However, I was always too lazy to make an effort of booking a ticket for myself and going there alone.

Luckily, today, Wesley asked me to join him and other friends to the play. So, we arrived at Ranga Shankara at 7:20 PM, 10 minutes before the play would start and found some seats for the five of us. We realized that we should reach there at least half an hour before the scheduled time to get better seats, the ones that are aligned parallely to the stage and provide the best view. The play started almost on time and I enjoyed most of it. There were three actors and one guitarist. I found it very different from the plays I have watched before. Each actor was not playing just one character. The actors danced and moved around the stage assuming different characters in different situations. When the play began, one actor played the role of a father and the other played the role of a mother. The third actor played the role of a new born baby. He was also the narrator. Half an hour later, when the new born baby had grown up to be an Oscar winner, the first two actors played the role of journalists asking questions to the third actor after the Oscar ceremony was over. Five minutes later, one of the first two actors was making threatening calls to the Oscar winner.

The play made a lot of sense to me. These days, when I am with Wesley, most of the funny things I say are satirical reflections on how some people try to define what our culture is and how these people harass those who do not agree with or follow their definition. The play was trying to express a similar idea and so, I enjoyed it a lot.

From the Ranga Shankara calendar it seems that they have a play on each day of the week except Monday. The one on the weekends are mostly in English. So, I will try to go there more often, probably every weekend if I get some company from friends, colleagues or anyone else.

Sunday, March 29, 2009

Apache Nutch 1.0 Released

Today, we received an announcement from the Nutch committer, Sami Siren that Apache Nutch 1.0 has been released. An extract from the announcement:
Apache Nutch, a subproject of Apache Lucene, is open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats.

Apache Nutch 1.0 contains a number of bug fixes and improvements such as Solr Integration, new indexing framework and new scoring framework just to mention a few. Details can be found in the changes file:

http://svn.apache.org/repos/asf/lucene/nutch/tags/release-1.0/CHANGES.txt

Apache Nutch is available for download from the following download page:
http://www.apache.org/dyn/closer.cgi/lucene/nutch/nutch-1.0.tar.gz

I have been waiting for this release for a long time as I made some contributions to this project. These contributions were also my first contributions to an open source project. In 2007, when I was playing with the search engine in Infosys (my previous employer), I found a few things that could be fixed and improved. I submitted these fixes and enhancements to the Nutch community and they were committed to the subversion repository. Let me list my contributions from the CHANGES.txt file.
62. NUTCH-559 - NTLM, Basic and Digest Authentication schemes for web/proxy
server. (Susam Pal via dogacan)

77. NUTCH-44 - Too many search results, limits max results returned from a
single search. (Emilijan Mirceski and Susam Pal via kubes)

80. NUTCH-612 - URL filtering was disabled in Generator when invoked
from Crawl (Susam Pal via ab)

81. NUTCH-601 - Recrawling on existing crawl directory (Susam Pal via ab)
NUTCH-559 was my major contribution. I did this when I found that Nutch was unable to authenticate itself to the intranet sites which were protected with NTLM authentication scheme. I modified the module that deals with the HTTP protocol so that it could authenticate itself with configured credentials when challenged with authentication. While developing this, I also developed support for Basic and Digest authentication schemes. More details on this can be found in NUTCH-559 (JIRA) and the Nutch wiki entry on HTTP authentication schemes.

NUTCH-44 and NUTCH-612 were bug fixes. NUTCH-601 involved the removal of a minor irritant. In the days of Nutch 0.9, the crawler complained if a directory with the name 'crawl' already existed in the current directory. As a result, before beginning a re-crawl using the bin/nutch crawl command, we had to move the existing crawl directory to another location. After a discussion in the community, we agreed that it was better to avoid shuffling the crawl directories by allowing re-crawls on the same directory. The change was made and committed.

Nutch users' mailing list has often received mails from users who wanted to know how they can enable support for authentication schemes in Nutch 0.9 by applying the patch in NUTCH-559. Patching Nutch 0.9 was a little cumbersome as the patch was generated against the trunk. With this release, the users can simply download Nutch 1.0 and configure the authentication schemes.

Tuesday, February 03, 2009

AUTH CRAM-MD5

I spent some time last weekend trying to understand the various SMTP authentication mechanisms. I found CRAM-MD5 authentication mechanism particularly interesting as it is a challenge-response authentication and involves HMAC-MD5. Let me first show a complete session with an ESMTP server before coming to CRAM-MD5. In the following session, I have selected the PLAIN authentication mechanism. Whatever I have entered appears in italic font.
susam@nifty:~$ telnet mail.susam.in 25
Trying 64.62.254.117...
Connected to sdclinux2.rdsindia.com.
Escape character is '^]'.
220 sdclinux2.rdsindia.com ESMTP
EHLO
250-sdclinux2.rdsindia.com
250-PIPELINING
250-8BITMIME
250-SIZE 11534336
250 AUTH LOGIN PLAIN CRAM-MD5
AUTH PLAIN AGZvb0BzdXNhbS5pbgBkcm93c3NhcA==
235 ok, go ahead (#2.0.0)
MAIL FROM:<foo@susam.in>
250 ok
RCPT TO:<example.recipient@gmail.com>
250 ok
DATA
354 go ahead
Date: Mon, 2 Feb 2009 10:28:00 +0530
From: Foo <foo@susam.in>
To: Example Recepient <example.recipient@gmail.com>
Subject: Test Mail

This is a test mail.
.

250 ok 1233593978 qp 15929
QUIT
221 sdclinux2.rdsindia.com
Connection closed by foreign host.
susam@nifty:~$
In the AUTH PLAIN line I have sent the base64 encoding of the string \x00foo@susam.in\x00drowssap. \x00 indicates a null character, foo@susam.in is the sender's user name and drowssap is the sender's password. Note that the user name here is in email address format. For most servers, the user name would only be 'foo' instead. If an eavesdropper intercepts this network traffic, he can easily find the user's password by simply decoding the base64 response sent by the client. This is also susceptible to replay attacks as the eavesdropper can use the AUTH PLAIN line containing the base64 encoded credentials, in future, to log into the server.

Now, let me quickly show only the relevant lines for an authentication using the LOGIN mechanism.
250 AUTH LOGIN PLAIN CRAM-MD5
AUTH LOGIN
334 VXNlcm5hbWU6
Zm9vQHN1c2FtLmlu
334 UGFzc3dvcmQ6
ZHJvd3NzYXA=
235 ok, go ahead (#2.0.0)
What is happening here becomes clear by decoding the base64 encoded messages. The following statements in Python 2.5.2 decode these messages.
>>> 'VXNlcm5hbWU6'.decode('base64')
'Username:'
>>> 'Zm9vQHN1c2FtLmlu'.decode('base64')
'foo@susam.in'
>>> 'UGFzc3dvcmQ6'.decode('base64')
'Password:'
>>> 'ZHJvd3NzYXA='.decode('base64')
'drowssap'
LOGIN authentication mechanism is susceptible to the same problems that PLAIN authentication mechanism is susceptible to. Now, let me describe the CRAM-MD5 authentication mechanism. When the client selects the CRAM-MD5 authentication mechanism, the server sends a base64 encoded challenge as shown below.
250 AUTH LOGIN PLAIN CRAM-MD5
AUTH CRAM-MD5
334 PDc0NTYuMTIzMzU5ODUzM0BzZGNsaW51eDIucmRzaW5kaWEuY29tPg==
An HMAC is calculated for this challenge with the password as the key and MD5 function. A string is formed by concatenating the user name, a space and the hex representation of the HMAC. The base64 encoding of this string is sent as the response by the client. The following statements I tried in Python 2.5.2 show how a response can be formed for the above challenge.
>>> 'PDc0NTYuMTIzMzU5ODUzM0BzZGNsaW51eDIucmRzaW5kaWEuY29tPg=='.decode('base64')
'<7456.1233598533@sdclinux2.rdsindia.com>'
>>> import hmac, hashlib
>>> hmac.new('drowssap', '<7456.1233598533@sdclinux2.rdsindia.com>', hashlib.md5).hexdigest()
'667e9fa4470dff3da9d621fe40676722'
>>> 'foo@susam.in 667e9fa4470dff3da9d621fe40676722'.encode('base64')
'Zm9vQHN1c2FtLmluIDY2N2U5ZmE0NDcwZGZmM2RhOWQ2MjFmZTQwNjc2NzIy\n'
Of course, this can be written as a small function:
import hmac, hashlib
def cram_md5_response(username, password, base64challenge):
return (username + ' ' +
hmac.new(password,
base64challenge.decode('base64'),
hashlib.md5).hexdigest()).encode('base64')
The following snippet shows the ESMTP server accepting the client-response.
250 AUTH LOGIN PLAIN CRAM-MD5
AUTH CRAM-MD5
334 PDc0NTYuMTIzMzU5ODUzM0BzZGNsaW51eDIucmRzaW5kaWEuY29tPg==
Zm9vQHN1c2FtLmluIDY2N2U5ZmE0NDcwZGZmM2RhOWQ2MjFmZTQwNjc2NzIy
235 ok, go ahead (#2.0.0)
CRAM-MD5 authentication mechanism is relatively more secure than the other two mechanisms because the password can not be retrieved by decoding the base64 encoded client-response. The password is used as the key to calculate the HMAC but the password is not present anywhere in the response. It prevents replay attacks too because the server sends an unpredictable challenge for every authentication. The client-response computed for a certain challenge is invalid for further authentications which will involve different unpredictable challenges.

References:
  1. RFC 4954 : SMTP Service Extension for Authentication
  2. RFC 4616 : The PLAIN Simple Authentication and Security Layer (SASL) Mechanism
  3. RFC 2195 : IMAP/POP AUTHorize Extension for Simple Challenge/Response
  4. RFC 2104 : HMAC: Keyed-Hashing for Message Authentication

Sunday, September 14, 2008

Combinatorial coincidence

I joined RSA a few weeks ago. Here, I met Anoop who has a passion for combinatorics. We challenge each other with combinatorics problems.

I asked him the following question at lunch.
f0(n) = 1; fk(n) = Σni=1 fk-1(i). Find a closed formula for fk(n).
He asked me one on programming. I have used subscripted variables to represent the counters for the nested loops in the pseudocode to show the question.
count = 0
for c1 in 1 to n:
for c2 in 1 to c1:
for c3 in 1 to c2:
...
...
for ck in 1 to ck-1:
count++
What is the final value in count after the loop exits?
With one question each, we went back to our desks. As, I started solving his question on nested loops, I realized that his problem led me to the recurrence relation in the question I asked him. If k = 1, count = n = f1(n). If k = 2, the inner loop with the counter as c2 will run once when c1 = 1, twice when c1 = 2, and so on. So, the final value of count will be f1(1) + f1(2) + ... + f1(n) = n(n+1)/2. One may note that this is the value of f2(n).

Extending this argument, one may realize that for any k, count = fk-1(1) + fk-1(2) + ... + fk-1(n) = fk(n). In other words, the answer to his question is the answer to my question. I already knew the answer to this. So, I went back to his desk with the answer: C(n+k-1, k).

I had arrived at this closed formula earlier by solving fk(n) for the first few k's using Faulhaber's formula. I noticed that they were all equal to C(n+k-1, k). So, I proved that this is always true by the principle of strong mathematical induction. It involved proving that: fk+1(n) = C(k, k) + C(k+1, k) + ... + C(n+k-1, k) = C(n+k, k+1)

This was a remarkable coincidence that we asked each other questions which had the same answer. When I explained the coincidence to him, he explained how he arrived at the same result for the question on nested-loops which can be used to determine the closed formula for the recurrence relation too.

In the question on nested-loops, the following condition is always met: 1 <= c1 <= c2 <= ... <= ck <= n. So, the number of times count variable would be incremented is equal to the number of possible ways we can arrange k numbers from the first n natural numbers in ascending order. The answer to this is the number of possible ways we can arrange n - 1 similar balls and k similar sticks. If we consider the number of balls to the left of ith stick and add one to it, we get a valid value for ci as the number of balls to the left of a stick can not decrease as we move right and consider (i+1)th, (i+2)th, etc. sticks. Similarly, the number of balls to the left of a stick can not increase as we move left and consider (i-1)th, (i-2)th, etc. sticks. Also, any set of valid values for c1, c2, ..., ck can be represented as an arrangement of these sticks and balls.

We can arrange n-1 similar balls and k similar sticks in (n+k-1)! / {(n-1)! × k!} = C(n+k-1, k) number of ways. This is the answer to the question on nested loops as well as that on the recurrence relation.

Saturday, August 23, 2008

Plotting Graphs

Finally, I realise that I have found the right tool to plot graphs that I can use peacefully without having to worry about anything other than plotting the graph.

It started with my engineering studies when I wanted to show a couple of my friends that an FM wave formed by combining the baseband data signal with a sinusoidal carrier is really frequency modulated where the instantaneous frequency of the FM wave varies as per the baseband data signal. I wrote a program in C and plotted the graph with the help of the graphics library available with Turbo C. But soon, I came across Linux and I found that I couldn't compile those programs using gcc.

Then, I came across Java and learnt drawing stuff using Swing. But it was an involving task. I drew the grid by running a loop that drew Line2D.Double objects on the Graphics object. I plotted the points drawing Ellipse2D. So, I had to do something like the following to plot the dot exactly over a point (x,y).

// Plot the point
int thickness = 4;
Ellipse2D point = new Ellipse2D.Double(x - thickness/2, y - thickness/2,
thickness, thickness);
g2.fill(point);
This thing was inside a loop that calculated y for each x. Of course, I had to do the routine work of getting the screen dimensions, add a JPanel to the ContentPane, etc. This wasn't fun as my main objective was to study the curve.

Then, I came across Matplotlib, a plotting library for Python. Since, I was already familiar with MATLAB, I enjoyed it. The function names and usage were very similar to that of MATLAB commands. It gave me all the powerful things like plot(), axis(), xlabel(), ylabel(), etc. but I missed plot3() a lot. So, I could not plot 3D graphs.

I was already using GNU Octave for quite sometime as an alternative to bc. It helped me to calculate the binomial coefficients quickly using the nchoosek() command. GNU Octave is very similar to MATLAB and I enjoyed it. So, I checked whether it supported the plot() function. It did. I checked plot3(). It worked too. So, I started using this for my plotting work. Last week, I plotted many different families of curves along with their envelopes and I found the task really easy and simple with GNU Octave. Let me show its simplicity with an example. I wanted to plot the family of lines given by the function, y(x) = 2cx - c2 along with its envelope. See the code and its simplicity:

% Presentation
axis([-5, 5, 25, -10]);
grid("on")
hold on

% Plot the family of curves: y(x) = 2cx - c^2
x = [-5:0.1:5];
for c = [-10:0.5:10]
plot(x, 2 * c * x - c^2, 'b-')
endfor

% Plot its envelope: y(x) = x^2
plot(x, x .^ 2, 'r-');
I think I'm going to use this for a long time.