### Assignment #4: Hashtables

Due: Mar 24: before 11:59:59 PM

In this assignment you will build two versions of a very basic hashtable.

Your hashtable should have the following properties:

• It should use strings as the key and it should store non-zero integers. The integers must be stored as integers.
• The string used for keys should only contain letters a..z, A..Z
• The string used for keys should be 8 characters or less.
• The string used for a key may be downcased before hashing. (This is not required but is a good idea.)
• The hashtable should use linear probing
• The hashtable must implement tombstoning. (Given the above requirements, a useful tombstone might be "!!".)
• The hashing function may be simple polynomial accumulator. Suppose that the key is stored in char key[]. Then the accumulator could look something like:
```        res = (key[0]-'a')+(key[1]-'a')*13+...+(key[n]-'a')*13**n
```
(** inidcates 13 raised to the nth power; this is not an operator in C). (You will probably want to use a long integer in your hashing function.) There are better, more complex, hashing algorithms. This one is sufficient.
• The size of the hashtable should be provided as command line parameter. That is, no rehashing, no growth, no shrinkage. Ideally the size of the hashtable should be a prime, but this need not be enforced by the program.
• The hashtable should only refuse new items when it is completely full.
• The program should accept input through a pipe. The input should be of the following form:
```            put aaa 6
put bbbbcd 12
get aaa 5
rem aaa 5
get aaa 7
```
where:
1. Every line contains exactly 3 items. The first item is a string that must be either: "get", "put", or "rem". The second item is the key. The third item is the value. For get and rem the value is ignored, it just makes parsing easier if there are always 3 items on a line.
2. put adds items to the hashtable. On every put, the program should print either: 0 if there is no room in hashtable; -1 is the key is invalid; or two integers. The first is the hash value of the key, the second is the actual location in the hashtable at which the value is stored.
3. Get gets the value in the hashtable associated with the key. If the key is not in the hashtable return -1.
4. rem removes the key from the hashtable. On removal, a tombstone should be added to the hashtable
• If the commands to the hashtable are in a file named htcommands.txt, the compiled executable is named hhtt and you want to create a hashtable of size 97, then the command line to run your hashtable program should be
```        hhtt 97 < htcommands.txt
```
• Hint: you will need two arrays to implement your hashtable, one storing keys and one storing values.
For example: here is an input file:
```    put dfsdfs 7
put dklsjfds 20
put d 234
put ff 345
put ee 456
put ff 567
put gg 678
put hh 789
put ii 890
put jj 990
put kk 901
put mm 012
get ee 1
get ff 1
get mm 1
get dfsdfs 1
rem ff 1
rem kk 1
get mm
put ff 1024
put gg 2048
put zzz 4096
```
and here is a trace output from that input. Note that I made a very small hashtable and got it quite full. This was intentional as only be being pretty full can you really test probing. (You do not need to have the same output, and indeed I encourage you to innovate as there are many ways in the output could be clearer.)
```    [gtowell@powerpuff HW4a]\$ cat inp.txt | ht 13
PUT 8  8
PUT 8  9
PUT 6  6
PUT 8  10
PUT 7  7
PUT 8  10
PUT 9  11
PUT 10  12
PUT 11  0
PUT 12  1
PUT 0  2
PUT 2  3
GET ee  456
GET ff  567
GET mm  12
GET dfsdfs  7
REM ff  1
REM kk  1
GET mm  12
PUT 8  10
PUT 9  11
PUT 2  2
The Hashtable
=============
0                    ii   890
1                    jj   990
2                   zzz  4096
3                    mm    12
4                           0
5                           0
6                     d   234
7                    ee   456
8                dfsdfs     7
9              dklsjfds    20
10                    ff  1024
11                    gg  2048
12                    hh   789

```

#### Reading the input file

I suggest that you use fgets to read stdin line by line then strtok to separate items on the line (use a space as the splitting character). However, that is not required.

#### Part 1: 70%

In the first part you will implement the hashtable using standard array notation. You may use pointers in dealing with strings. Otherwise you should use array notation. Get this completely working before going on to part 2

#### Part 2: 30%

In the second part you should use only pointers and pointer math. Therefore in the second part, the only use of [] should be in the lines that actually create arrays.

### What to submit:

Collect together in a single directory all of the following:
• README file. This is a file literally named README. This file should follow the format of ../README.txt Your readme file should clearly state how to compile and run you program. Also, the standard reflection on what went well or poorly.
• All program files for this task. If you use a makefile for compilation, include that
• A test file of your creation.
• A script showing the use of your code on your test file.
• Do NOT include compiled code.
When you have everything ready in this directory, then go to the directory containing the one with the information you want to submit. For instance, suppose that the files specified above are in /home/YOU/cs246/a4. Then you should go to /home/YOU/cs246.
Use the submit program as follows:
```  /home/gtowell/bin/submit -c 246 -p 4 -d a4
```
This will package up everything in the directory a4 and submit it under your login name along with a timestamp of your submission date.

Back to CMSC246 Home page