• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Ron McLeod
  • Rob Spoor
  • Tim Cooke
  • Junilu Lacar
Sheriffs:
  • Henry Wong
  • Liutauras Vilda
  • Jeanne Boyarsky
Saloon Keepers:
  • Jesse Silverman
  • Tim Holloway
  • Stephan van Hulst
  • Tim Moores
  • Carey Brown
Bartenders:
  • Al Hobbs
  • Mikalai Zaikin
  • Piet Souris

I need some help with Huffman encoding...

 
Greenhorn
Posts: 3
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
OK, so one of my teachers from college just made me do a file compressing utility, using the Huffman encoding. I don't have much problem with the code tree, since I've been reading quite a bit about data structures in Java. The problem, however, is with the encoding process itself.

The thing is, that Huffman encoding creates code words of absolutely any length, from 1 bit to whatever the **** you get. But guess what? A computer can only handle full bytes. That's a big problem for me, 'cause I have absolutely no idea on how to handle these "partial bytes", and I don't think padding with zeroes would do, because that would make my program much, much less efficient. I've been looking on the Internets a way to read and handle "partial bytes" with Java, partly because I'm really good at Java, partly because I know absolutely nothing about C++, but so far I've found nothing at all.

Any idea on how can I keep my Huffman encoding efficient?
 
Rancher
Posts: 43026
76
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Welcome to JavaRanch.

Is performance really that important in this assignment? I would think that correctness is foremost.

One way around this might be the java.util.BitSet class, which lets you handle "bit strings" of any length.
 
author
Posts: 23907
142
jQuery Eclipse IDE Firefox Browser VI Editor C++ Chrome Java Linux Windows
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
You may want to ask your professor on what you are allowed to use.

If I remember correctly, way back in college, when I did this assignment, I was expected to use the bit operators to mask and shift the bits into place. I did it in C back then -- I wanted to do it in assembly (for my Commodor 64), but my professor insisted that it ran on the school's DEC system.

Anyway, Java has all of the bit operators from the C language, so you should check if this is what you are expected to use.

Henry
 
Rancher
Posts: 4686
7
Mac OS X VI Editor Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Acoyani Garrido Sandoval:
But guess what? A computer can only handle full bytes. That's a big problem for me, 'cause I have absolutely no idea on how to handle these "partial bytes", and I don't think padding with zeroes would do, because that would make my program much, much less efficient.



As others have said, why worry about efficiency.

Sounds like a good assignment. One thing it will teach you is that its only modern computers driven from a particular set of engineering assumptions that even have a concept of a 'byte' which seems to be universally defined as 8 bits.

Some of the greatest old computers were word oriented, and could pick up and write any number of bits anywhere in the word (which was often 36 bits). So it is good to break out of the mindset that a byte is an important concept, and that all bytes contain 8 bits.

Most compression systems use heavy packing, not only Huffman, but LZW, and similar approaches.

You can always use shift and XOR/AND/OR operators to do anything.
 
reply
    Bookmark Topic Watch Topic
  • New Topic