Pandas utilizing CSV in Docker

 
I am new to pandas and Docker and am experimenting with a simple two-container application. Container 1 runs DataGen.py, which generates a random number every 10 seconds, 10 times, and should store each number in Book2.csv inside a volume. Container 2 runs DataCalc.py, which reads Book2.csv from that volume and prints summary statistics every 10 seconds, 10 times. When I run the two scripts manually, outside Docker, everything works. When I run them inside Docker, the last step fails with an error. See below for details.

Thank you.

**DataGen.py**

   import pandas as pd
   import numpy as np
   from csv import writer
   import time
   
   for i in range(10):
       # Draw one sample from the standard normal distribution.
       num = np.random.normal()
       # Relative path: resolved against the process's current working directory.
       with open('Book2.csv', 'a+', newline='') as fd:
           csv_writer = writer(fd)
           csv_writer.writerow([num])
           print(num)
       time.sleep(10)

**DataGen.Dockerfile**

   FROM python
   
   RUN pip install pandas
   
   WORKDIR /mydata
   
   COPY DataGen.py ./
   
   CMD python DataGen.py

**DataCalc.py**

   import pandas as pd
   import numpy as np
   from csv import reader
   import time
   
   for i in range(10):
       df = pd.read_csv(r'/data/Book2.csv')
       df_mean = np.mean(df)
       df_std = np.std(df)
       df_median = np.median(df)
       print(df.tail(1), df_mean[0], df_std[0], df_median)
       time.sleep(10)
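
A sketch of what the reader side is meant to compute, under two assumptions: the file holds one number per row with no header line, and a missing file should be waited for rather than treated as fatal. `wait_for_file` and `summarize` are hypothetical helpers, and the standard-library `csv` and `statistics` modules stand in for pandas and NumPy so the sketch has no dependencies. `statistics.pstdev` is the population standard deviation, which is also what `np.std` computes by default on an array.

```python
import csv
import statistics
import time
from pathlib import Path


def wait_for_file(path, timeout=5.0, poll=0.1):
    """Return True once path exists, or False if timeout seconds elapse first."""
    deadline = time.monotonic() + timeout
    while not Path(path).exists():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True


def summarize(path):
    """Return (last, mean, population std, median) for a headerless one-column CSV."""
    with open(path, newline="") as fd:
        # Skip empty rows; every remaining row contributes its first field.
        values = [float(row[0]) for row in csv.reader(fd) if row]
    return (
        values[-1],
        statistics.mean(values),
        statistics.pstdev(values),
        statistics.median(values),
    )
```

Inside the loop this could be used as `if wait_for_file('/data/Book2.csv'): print(summarize('/data/Book2.csv'))`. If pandas is kept instead, `pd.read_csv(path, header=None)` reads such a file without treating the first row as column names.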

**DataCalc.Dockerfile**

   FROM python
   
   RUN pip install pandas
   
   WORKDIR /mydata2
   
   COPY DataCalc.py ./
   
   CMD python DataCalc.py

I performed the following steps:

1. `sudo docker build . -f DataGen.Dockerfile`
2. `sudo docker run --name pytest -v ${PWD}:/data 98`
3. Output I got:

>     -0.5461521300011364
>     -0.28622940021542786
>     -0.49742451640743496
>     -1.25765977303578
>     -1.0777765407386313
>     0.4049991325850138
>     -2.102117991682628
>     1.7556065421001426
>     0.21521175249561658
>     1.34793178068966

4. `sudo docker build . -f DataCalc.Dockerfile`
5. `sudo docker run --name pytest2 -v ${PWD}:/data fc`

However, the last step fails with the following **error**:

   Traceback (most recent call last):
     File "DataCalc.py", line 13, in <module>
       df = pd.read_csv(r'/data/Book2.csv')
     File "/usr/local/lib/python3.8/site-packages/pandas/io/parsers.py", line 676, in parser_f
       return _read(filepath_or_buffer, kwds)
     File "/usr/local/lib/python3.8/site-packages/pandas/io/parsers.py", line 448, in _read
       parser = TextFileReader(fp_or_buf, **kwds)
     File "/usr/local/lib/python3.8/site-packages/pandas/io/parsers.py", line 880, in __init__
       self._make_engine(self.engine)
     File "/usr/local/lib/python3.8/site-packages/pandas/io/parsers.py", line 1114, in _make_engine
       self._engine = CParserWrapper(self.f, **self.options)
     File "/usr/local/lib/python3.8/site-packages/pandas/io/parsers.py", line 1891, in __init__
       self._reader = parsers.TextReader(src, **kwds)
     File "pandas/_libs/parsers.pyx", line 374, in pandas._libs.parsers.TextReader.__cinit__
     File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
   FileNotFoundError: [Errno 2] File /data/Book2.csv does not exist: '/data/Book2.csv'
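
One possible explanation, offered as an assumption rather than a confirmed diagnosis: DataGen.py opens `'Book2.csv'` by relative path, so inside the container the file would be created under the `WORKDIR` (`/mydata`) and not in `/data`, where `${PWD}` is mounted. DataCalc.py would then find nothing at `/data/Book2.csv`. A minimal sketch of a writer that targets the mounted directory explicitly is below. `DATA_DIR` and `append_value` are hypothetical names, and `random.gauss` stands in for `np.random.normal` to keep the sketch dependency-free.

```python
import csv
import os
import random
import tempfile
from pathlib import Path


def append_value(data_dir, value, filename="Book2.csv"):
    """Append one value as a CSV row under data_dir and return the file path."""
    path = Path(data_dir) / filename
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "a", newline="") as fd:
        csv.writer(fd).writerow([value])
    return path


if __name__ == "__main__":
    # DATA_DIR is a hypothetical environment variable; in the container it
    # would be set to the mount point (/data). A temporary directory is used
    # as the fallback so the sketch also runs outside Docker.
    data_dir = os.environ.get("DATA_DIR") or tempfile.mkdtemp()
    for _ in range(3):
        num = random.gauss(0.0, 1.0)
        print(num, "->", append_value(data_dir, num))
```

With the output directory passed in explicitly, the writer and the reader agree on one location no matter which `WORKDIR` each image uses.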
 