Problem: Couldn't find file within Map function

 
Arwa Saad
Greenhorn
Posts: 12
Hello,
I'm creating a sentiment analysis app using the SentiStrength tool.
The problem is that this tool looks up files for comparison purposes. I have placed all the files the tool needs in HDFS.
When I try to open one of them from the map function, it says that it couldn't find the file.

This is the code


As you can see, I have to pass the path to this file. I'm sure it is in HDFS, so why can't it be found?
 
Karthik Shiraly
Bartender
Posts: 1210
An HDFS URL is not like a regular filesystem path.
A regular file path like "c:\data\myfile.txt" or "/var/lib/myfile.txt" can be read or written using file I/O APIs, because the OS's filesystem layer knows how to read and write it.
But an HDFS URL is understood only by the HDFS client and daemons; the OS's filesystem layer does not recognize it as a file.
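
To see the difference for yourself, here is a small sketch using only the standard Java library. The HDFS URL and the class and method names are made up for illustration. It hands the URL to java.io.File, which treats the string as an ordinary local path and therefore finds nothing.

```java
import java.io.File;
import java.net.URI;

class HdfsUrlVsLocalPath {
    /** True if the location carries a URI scheme other than "file", e.g. "hdfs". */
    static boolean isRemoteUrl(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme != null && !scheme.equals("file");
    }

    public static void main(String[] args) {
        // Hypothetical HDFS location; host, port and directory are assumptions.
        String hdfsUrl = "hdfs://namenode:8020/user/hadoop/EmoticonLookupTable.txt";

        // java.io.File knows nothing about HDFS: it interprets the string as a
        // path on the local disk, where no such file exists.
        File asLocalFile = new File(hdfsUrl);

        System.out.println("remote URL: " + isRemoteUrl(hdfsUrl));
        System.out.println("found by local file I/O: " + asLocalFile.exists());
    }
}
```

Any library that opens its data files with plain file I/O runs into the same thing when it is given an HDFS URL.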

One simple solution is for your mapper to copy whatever files SentiStrength requires from HDFS onto the node's local filesystem, and then pass those local filesystem paths to SentiStrength.
You can use FileSystem.copyToLocalFile to do this.
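
The copy-then-use-local-path pattern looks roughly like the sketch below. To keep it runnable without a cluster, it copies from a plain InputStream with the standard library. In a real mapper the stream would come from HDFS (for example FileSystem.open), or the whole helper could be replaced by a single FileSystem.copyToLocalFile call. The class name, helper name and sample content are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class LocalCopy {
    /** Copies the stream into localDir/fileName and returns the resulting local path. */
    static Path copyToLocal(InputStream source, Path localDir, String fileName) throws IOException {
        Files.createDirectories(localDir);
        Path target = localDir.resolve(fileName);
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the HDFS input stream (assumption: placeholder content).
        InputStream source = new ByteArrayInputStream(
                "placeholder content".getBytes(StandardCharsets.UTF_8));

        Path localDir = Files.createTempDirectory("sentistrength");
        Path localFile = copyToLocal(source, localDir, "EmoticonLookupTable.txt");

        // This local path is what you would hand to a tool that uses plain file I/O.
        System.out.println(localFile.toAbsolutePath());
    }
}
```

Doing the copy once in the mapper's setup step, rather than in every map call, avoids repeating it for each record.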

A more efficient way of doing the same thing is to add the files required by SentiStrength to the Job as cache files:
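
A minimal sketch of building such a cache file URI, using only the standard library so it runs without Hadoop. The HDFS directory and the class and helper names are assumptions. With the newer org.apache.hadoop.mapreduce API, the resulting URI would be passed to Job.addCacheFile in the driver; that call appears only as a comment here.

```java
import java.net.URI;

class CacheFileUri {
    /** Builds "<hdfsPath>#<linkName>"; the part after '#' is the URI fragment. */
    static URI withLinkName(String hdfsPath, String linkName) {
        return URI.create(hdfsPath + "#" + linkName);
    }

    public static void main(String[] args) {
        // Hypothetical HDFS directory; only the file name comes from the thread.
        URI cacheUri = withLinkName(
                "/user/hadoop/sentistrength/EmoticonLookupTable.txt",
                "EmoticonLookupTable.txt");

        // In the job driver (Hadoop 2.x and later): job.addCacheFile(cacheUri);

        System.out.println("HDFS path:  " + cacheUri.getPath());
        System.out.println("local name: " + cacheUri.getFragment());
    }
}
```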

When the mapper is executed, every node downloads this file automatically under the name "EmoticonLookupTable.txt" (i.e., whatever name follows the #).
Then use it in the mapper like any local file:
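
A rough sketch of what that looks like, again with only the standard library. In a real task the file would already be in the working directory, so the bare name is enough. Here a temporary file with placeholder lines stands in for it, and the class and helper names are made up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class ReadCachedFile {
    /** Reads every line of a local file as UTF-8 text. */
    static List<String> readLocal(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // In the mapper's setup step the relative name alone would do, e.g.
        //   Path cached = Paths.get("EmoticonLookupTable.txt");
        // A temp file with placeholder lines stands in for the cached file here.
        Path cached = Files.createTempFile("EmoticonLookupTable", ".txt");
        Files.write(cached, Arrays.asList("first placeholder line", "second placeholder line"));

        for (String line : readLocal(cached)) {
            System.out.println(line);
        }
    }
}
```

Since the name resolves on the local disk, a tool that only understands plain file paths can be given that same name.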