Trie is a special kind of a tree (not necessarily binary, although it could be) that, given a finite set of alphabet, stores the words of its language in a very efficient way. It first begin by an empty node and then adds new nodes to this initial node. The maximum child nodes size is the size of alphabet and the maximum level of the tree is the length of the longest word in the language. For example, if the alphabet contains 26 letters and the longest word in it is of size 3, the Trie will have 3 levels with each of its nodes has at maximum 26 child nodes. This gives us a total of 26*26*26 nodes to store all possible combinations of 3 letters from letter0 to letter25. This is very efficient as there is no duplication in the Trie. If the language has these two words "for" & "form", they won't take extra space. Instead of storing them as a separate words, they will share all the common letters, and then a flag or something will be set to 1 when we reach the 'r' to indicate an end of a word. Then continue the word "form" till you reach the 'm' and set the flag to be 1 to indicate an end of another possible word. What if we want to add another word like "format", we will just add another child 'a' to that 'm' node we obtained in the previous example and add a new child 't' to it and set its flag to be 1 and so on.
Why is Trie so efficient ??!!
Imagine in the previous example that we have a language that has 3 words "for", "form" and "format", a typical way to store these words is to build a hash table or an array of strings or a set or any other data structure that will store each of these words in a separate place. There will be 3 strings of lengths 3, 4 and 6 which will sum up to 13 byte memory space. The concept of a Trie is not to "repeat" already existed nodes in the tree. There can't be 2 children of one parent having the same value as we saw in the example. So, total used memory for the words above is actually 6 bytes which is the length of the longest of them (of course we can use extra memory indicating end of words and so on). If we would like to build a dictionary for a complete language with all its words, we will find so many common prefixes for the words. Using a Trie in this case will save a lot of memory and a lot of time when searching.
Another example to make you feel the importance of using Trie, consider a language of only one alphabet which is letter 'a'. Words of this language can have length of at most 1000 letters (all 'a's of course). A Trie can represent such a language using 1000 nodes --> 1000c byte (c is the size of the node which is about 8 bytes in this case, 4 bytes to point to the node child and another 4 bytes to hold how many times this word is present in the dictionary or 0 if it's not a word). Any of the other possible ways that will represent each word on its own will need 1 + 2 + 3 + ... + 1000 bytes --> (1000 * 1001) / 2 bytes = 500500 bytes. Using simple math, 8000 / 500500 = 0.016 which means that Tries use 0.016 of the space used in a hash table or an array.
Tries also sorts the words in it & it actually doesn't need to do extra work to do so. If the alphabet is {a, b} letters then you make sure that when you try to add a word beginning with 'a' that you put it in the first child node and when you try to add a word beginning with b you put it in the second child node and so on. Hence, when searching the Trie after adding all required words (in no particular order) the search results will explore the words in their order without having to spend extra time sorting them.
For more information about Tries, you can visit Wikipedia, or find a c++ implementation for a Trie here. It already has the implementation of adding and deleting from the Trie. You can try to do the Search function as a practice, and here is a couple of problems to exercise.
902 - Password Search (You will never be able to solve this problem without tries :D. Anything else will get you Time Limit Exceeded)
10999 - Crabbles (Never tried to do it without Tries actually. Don't know if it can be solved in any other way)
There is no wonder they always call it Data Structure & Algorithms as the development of some very efficient algorithms totally depend on how the data is structured in memory.
Sunday, April 20, 2008
Friday, April 18, 2008
Stack vs Recursion
When reading about stacks or recursion, I found lots of people saying that recursion can be simulated using stacks. I knew this was true but never actually tried to convert a recursive code to one using stacks before, well, until today. I found this problem talking about a language that has some kind of regular expression, they gave you the language rules and you them will be given a sentence to check if it's valid or not. First of all, I wrote a recursive code. It was the first thing I thought of, then, having received Wrong Answer for that, I decided to code it again from scratch. I then had the idea of making it using stacks as a replacement of the recursive function. The main idea was something like: having a string of letters push them all in the stack until you find a combination of letters that can be replaced by other simpler combination of letters, so replace them and continue doing this process until you can't make it any simpler. The code looks something like:
//s & s2 are stacks, str is a string
s = Sentence(str);
temp = ToString(s);
s2 = Sentence(temp);
while(s != s2)//can't make the sentence any simpler
{
s = s2;
temp = ToString(s);
s2 = Sentence(temp);
}
The code in Sentence() iterates on the string str passed to it (and oush it in the stack) and whenever it encounters a combination that can get simpler it just pop it from the stack and push the simpler equivalent one. Then back to the main(), I convert it to string to process on it again. I just discovered that I can do so without using a stack and by just using a string. It will save the time of ToString() function, but for me, it's more understandable this way.
That's if for this time. Hope I come back latter with something useful. Here are the problem links in the old and new servers, and in case they just invented a new server it's problem number 271 :D
//s & s2 are stacks, str is a string
s = Sentence(str);
temp = ToString(s);
s2 = Sentence(temp);
while(s != s2)//can't make the sentence any simpler
{
s = s2;
temp = ToString(s);
s2 = Sentence(temp);
}
The code in Sentence() iterates on the string str passed to it (and oush it in the stack) and whenever it encounters a combination that can get simpler it just pop it from the stack and push the simpler equivalent one. Then back to the main(), I convert it to string to process on it again. I just discovered that I can do so without using a stack and by just using a string. It will save the time of ToString() function, but for me, it's more understandable this way.
That's if for this time. Hope I come back latter with something useful. Here are the problem links in the old and new servers, and in case they just invented a new server it's problem number 271 :D
Hello World !!
Hello whoever reads this right now. It's me again, the one who is known among my friends to be addicted to blogging :D. Here I am opening a new blog just to talk about ACM stuff (I think you figured this out from the blog url :D). It will most likely be something like discussing problems (both problems I solve or write), techniques and maybe contests (in a professional way, not in the way I use to write about contests in my personal blog. At least that's what I hope for :D).
So now, for those who completely has no idea who I could be :D. Well, my name is Asmaa Magdi (can be changed to ACMaa which is the most nickname that suites me :D), a muslim egyptian computer science girl who just loves problem solving :D. That's pretty much what you are gonna need to know about me to follow up my posts. Ah, almost forgot to add, I'm now coaching a group of younger students in my college (Faculty of Computer & Information Science 'FCIS' Ain Shams University 'ASU' which is in Cairo, Egypt :D). I dream about making a difference in our ACM training in FCIS. I dream about a/an FCISian team advancing to ACM World Finals. I dream about a whole new generation of FCISian ACMers who is used to go to World Finals and finds it a piece of cake. I dream about building a new reputation for our dear FCIS in ACM community that "We can do it". And finally, I dream to be one of those who make our college proud by advancing to World Finals for the first time and hopefully not the last.
Not that those dreams has anything to do with what I'm gonna discuss here later, but I just wanted to state them. I will now leave you to read other posts without bothering you so much with useless talk (I know it, I'm so talkative :D).
So now, for those who completely has no idea who I could be :D. Well, my name is Asmaa Magdi (can be changed to ACMaa which is the most nickname that suites me :D), a muslim egyptian computer science girl who just loves problem solving :D. That's pretty much what you are gonna need to know about me to follow up my posts. Ah, almost forgot to add, I'm now coaching a group of younger students in my college (Faculty of Computer & Information Science 'FCIS' Ain Shams University 'ASU' which is in Cairo, Egypt :D). I dream about making a difference in our ACM training in FCIS. I dream about a/an FCISian team advancing to ACM World Finals. I dream about a whole new generation of FCISian ACMers who is used to go to World Finals and finds it a piece of cake. I dream about building a new reputation for our dear FCIS in ACM community that "We can do it". And finally, I dream to be one of those who make our college proud by advancing to World Finals for the first time and hopefully not the last.
Not that those dreams has anything to do with what I'm gonna discuss here later, but I just wanted to state them. I will now leave you to read other posts without bothering you so much with useless talk (I know it, I'm so talkative :D).
Subscribe to:
Posts (Atom)