• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Paul Clapham
  • Tim Cooke
  • Jeanne Boyarsky
  • Liutauras Vilda
Sheriffs:
  • Frank Carver
  • Henry Wong
  • Ron McLeod
Saloon Keepers:
  • Tim Moores
  • Frits Walraven
  • Tim Holloway
  • Stephan van Hulst
  • Carey Brown
Bartenders:
  • Al Hobbs
  • Piet Souris
  • Himai Minh

Huffman Tree - Data Structures

 
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Dear Java Community

I kindly want to inquire, if I briefly described a couple of efficient data structures and how they implement a Huffman Tree correctly they are the following:

1- Priority queue

The keys of the priority queue can be the frequencies and the letters used can be the values. The letters will be retrieved from highest to lowest frequencies.

2- Min Heap

Since, building a Huffman tree is bottom-up a min heap can be used to store the values and their encoding, and retrieve the smallest letter frequency easily.

3- HashMap

This data structure is efficient in accessing items in constant time. The keys can be the letters frequencies and values are the letters.

Any help is much appreciated. Thank you guys for helping me in my previous posts much appreciated.
 
Marshal
Posts: 76408
365
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

T Sukdeo wrote:. . . The keys of the priority queue can be the frequencies and the letters used can be the values. . . .

What keys? Queues don't have keys and values. Maps do, but please explain how you would use a Map to implement a Huffman tree.
Please also explain how a min heap would help.
 
T Sukdeo
Greenhorn
Posts: 15
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
My sincere mistake. Will do.
 
Campbell Ritchie
Marshal
Posts: 76408
365
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Don't edit old posts; post a new post with the explanation.
 
Saloon Keeper
Posts: 14266
321
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Note that a PriorityQueue is usually implemented using a Heap, so your first two options may not actually be distinct.
 
Ranch Hand
Posts: 125
12
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I implemented a Huffman tree just a few weeks ago, so fresh off my own experience (which may different than yours), may I suggest a good old-fashioned array?

Before I start, I will point out that for my application, I had a reasonably small symbol set (256 symbols... many implementations add a special symbol for "end-of-text").  If yours is much larger, my suggestion might be less effective.  Also, I am assuming you are thinking of a classic Huffman tree rather than an adaptive tree (such as Vitter's algorithm).

From the perspective of a tree, we recall that there are two kinds of nodes:  branches and leafs.     I used the same class for both, though I certainly could have used two different classes.   The node class had the rough form:



I declared two arrays.  The first was an array of leaf-node objects, one for each symbol.  This array was used for counting and coding symbols.   The second was an array of the same objects (e.g. it was a shallow copy of the first).  This array was used for sorting leafs by their count and building the Huffman tree.

I looped through my input text symbol-by-symbol.  Since my text consisted of bytes, the byte value could be used to map to an array index.  Using this index to access the proper leaf node from the primary array, I incremented its count.  When I was finished counting symbols, I sorted the second array in ascending order of counts (again, it contained the same node objects as the primary...  it just put them in a different order, the primary array was unchanged).  I removed zero-count elements. Then I applied Huffman's classic algorithm to build up the tree a pair of nodes at a time.  At this point, the second (sorted) array could be discarded).

To encode the text, I again made a pass through the symbols using their byte value as an index into the master array of leaf nodes.  Once I had the leaf, I traversed up the tree from bottom to top.  Of course, that direction of traversal was just the opposite of the bit order for the Huffman code (which goes from top to bottom/root to leaf), so I needed a good way to reverse them.    This might be where using an interesting collection class might come into play.  I used a Java Deque, treating it as a classic stack structure.  As I traversed up the tree (from child to parent), I put each node on the stack.  Once I reached the root node, I simply popped the stack to get back the nodes in the proper order of encoding.


For decoding the Huffman message, I just used the tree structure directly.  It was very efficient.  Sometimes, rolling your own data structure is the best way to go.

Hope this helps.

Gary


 
You’ll find me in my office. I’ll probably be drinking. And reading this tiny ad.
the value of filler advertising in 2021
https://coderanch.com/t/730886/filler-advertising
reply
    Bookmark Topic Watch Topic
  • New Topic