
Pig script is creating empty files in HDFS

 
João Souza
Hi experts,

I have this script:

--Insert a new column based on filename
Data = LOAD '/user/cloudera/Source_Data' using PigStorage('\t','-tagFile');

Data_Schema = FOREACH Data GENERATE
(chararray)$1 AS Date,
(chararray)$2 AS ID,
(chararray)$3 AS Interval,
(chararray)$4 AS Code,
(chararray)$5 AS S_In,
(chararray)$6 AS S_Out,
(chararray)$7 AS C_In,
(chararray)$8 AS C_Out,
(chararray)$9 AS Traffic;

--Split into different directories
SPLIT Data_Schema INTO Src1 IF (Date == '2016-06-25.txt'),
Src2 IF (Date == '2014-07-31.txt'),
Src3 IF (Date == '2016-01-01.txt');

STORE Src1 INTO '/user/cloudera/Source_Data/2016-06-25' using PigStorage('\t');
STORE Src2 INTO '/user/cloudera/Source_Data/2014-07-31.txt' using PigStorage('\t');
STORE Src3 INTO '/user/cloudera/Source_Data/2016-01-01' using PigStorage('\t');

And here is an example of my original source data:

10000  1388530800000  39  8.600870350350515  13.86183926855984  1.7218329193014124  3.424444103320796  25.972920214509095

When I execute it, the script runs successfully, but the output files in HDFS contain no data.

Note that I add a new column based on the file name. That's why I have one more column in the FOREACH statement.
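
A likely cause, sketched under assumptions: with the -tagFile option, PigStorage prepends the source file name as the first field, so the file name is $0 and the data columns start at $1. The FOREACH above starts at $1, so Date would hold the first data column (10000 in the sample row) rather than the file name. In that case none of the SPLIT conditions match and every output is empty. The sample row also has only eight columns, so $9 would not exist. The sketch below assumes the input files really are tab-delimited, and the output directory /user/cloudera/Split_Data is a hypothetical name, chosen only to keep the results out of the input directory.

```pig
-- -tagFile prepends the source file name as field $0
Data = LOAD '/user/cloudera/Source_Data' USING PigStorage('\t', '-tagFile');

-- Start at $0 so that Date holds the file name; $1..$8 are the eight data columns
Data_Schema = FOREACH Data GENERATE
    (chararray)$0 AS Date,
    (chararray)$1 AS ID,
    (chararray)$2 AS Interval,
    (chararray)$3 AS Code,
    (chararray)$4 AS S_In,
    (chararray)$5 AS S_Out,
    (chararray)$6 AS C_In,
    (chararray)$7 AS C_Out,
    (chararray)$8 AS Traffic;

-- Route each row by the file it came from
SPLIT Data_Schema INTO
    Src1 IF (Date == '2016-06-25.txt'),
    Src2 IF (Date == '2014-07-31.txt'),
    Src3 IF (Date == '2016-01-01.txt');

-- Hypothetical output location, kept separate from the input directory
STORE Src1 INTO '/user/cloudera/Split_Data/2016-06-25' USING PigStorage('\t');
STORE Src2 INTO '/user/cloudera/Split_Data/2014-07-31' USING PigStorage('\t');
STORE Src3 INTO '/user/cloudera/Split_Data/2016-01-01' USING PigStorage('\t');
```

To check the column layout before splitting, running `Sample = LIMIT Data 5; DUMP Sample;` in the Grunt shell should show the file name in the first position of each tuple.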
 
João Souza
Error
 