
Count number of words in ascending order

 
Jacek jaryszek
Ranch Hand
Hi guys,

I am using this code to get all job offers for "Business Analyst":



The issue is that I want to count the occurrences of each word and then sort them in ascending order, like:

python (50)
aws (49)
data (33)

Somehow the code is not working:



Thank you for your help.

And additionally, what can I improve in this code?

Best,
Jacek
 
Campbell Ritchie
Marshal

Jacek jaryszek wrote: . . . sort them after in asc order, like:

python (50)
aws (49)
data (33)

. . .

Please explain; are those the words before sorting? What does ascending order mean?
 
Jacek jaryszek
Ranch Hand
Hi Campbell!

I mean I want to count the number of occurrences of each word.
The words, with their weights, should be sorted starting from the largest number of occurrences.
If the word "home" appears 3 times, I just want to get that in the output.

For example, I have the sentence: "I love working remote because remote allow me to be at home. Being at home can be really exciting. At home i can do also house duties".

So in this sentence there are:
weight word
3 home
2 can
2 remote
1 allow
1 also
1 because
1 being
1 duties
1 exciting
1 house
1 love
1 really
1 working

This is what I am trying to achieve.

Best,
Jacek
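
For reference, the result described in this post can be sketched roughly as follows. This is only a minimal sketch, not code from the thread: the class name WordWeightSketch, the method countWords and the letters-only split pattern are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class WordWeightSketch {

    // Counts every word (case-insensitive) and returns the counts
    // ordered from the largest weight to the smallest.
    static Map<String, Long> countWords(String text) {
        // Assumption: a "word" is a run of the letters a-z; everything else separates words.
        Map<String, Long> counts = Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z]+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // A LinkedHashMap keeps the entries in the sorted order.
        return counts.entrySet().stream()
                .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (first, second) -> first, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        String text = "I love working remote because remote allow me to be at home. "
                + "Being at home can be really exciting. At home i can do also house duties";
        countWords(text).forEach((word, count) -> System.out.println(count + " " + word));
    }
}
```

Note that this counts every word, so short words such as "at" and "be" also show up in the output; words with the same weight appear in no particular order.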
 
Liutauras Vilda
Marshal

Jacek jaryszek wrote: The code is not working somehow


In which way?
 
Jacek jaryszek
Ranch Hand
Hi Liutauras,

I uploaded an image:

webpage

So generally it outputs one big stream of text and then writes (1):

All gather jobs text....
....text...text...text..
(1)

That is the output.
 
Liutauras Vilda
Marshal
Try to debug by printing out some of the earlier content, e.g. what does your collect hold?

The output suggests that you might have just one element in it; please check what's there. Try to locate the place where the issue starts.
 
Jacek jaryszek
Ranch Hand
So generally I have debugged this many times.

In this code:



the hash map built from AllJobOffers somehow shrinks: I have about 80 entries at the start, and after this step I have barely 35.
Why? I have no idea.

Thank you for your help,
Jacek

 
Liutauras Vilda
Marshal
Well, the numbers seem to be OK to me.

In AllJobOffers you have 81 elements, where each one more or less represents a (fairly big) paragraph of text.

Then you do a grouping by, and so you get 35 keys (big chunks of text), each mapped to its number of occurrences in AllJobOffers.

Those counts look as such:
So if you added those up, you'd get the same 85.

And then you are printing everything. The output looks the way it does because you are not counting words but paragraphs, since you don't break the text apart. One example of what you grouped by is this chunk of text (it seems you got it returned 10 times from different API calls):

a global bank are recruiting for a data scientist to join their global team. the role is a hybrid between data engineer and data scientist as it will be focused on supporting machine learning models, re-training the model with data, and removing biases.  we’ll look to you to demonstrate technical and people leadership to drive value for the customer through data science platform engineering using cloud technologies. you’ll be working closely with core technology and architecture teams to deliver strategic data solutions, while driving agile and devops adoption in the delivery of data engineering, leading a team of data engineers.
the core skills are around data science and cloud, with focus on mlops and model controls. the cloud provider of choice for this team is aws, so skills in aws data are also key.  
requirements:

expertise in machine learning
background in software engineering.
strong experience in python/pyspark.
experience on data science implementations and with awareness of the data science development methodologies.
exposure/awareness of big data technologies and aws data science tools like sagemaker , emr clusters would be highly appreciated.
knowledge of data engineering tools like glue, kafka, scala, ides like visual studio/pycharm/ intellij , nosql like mongodb, can only help.
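
To make the difference concrete, here is a minimal sketch of grouping whole texts versus grouping the words inside them. It is not the code from this thread: the class GroupingSketch, both method names, the sample offers and the letters-only split pattern are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class GroupingSketch {

    // Groups whole texts: identical paragraphs collapse into a single key,
    // so the map ends up with fewer keys than there were offers.
    static Map<String, Long> countParagraphs(List<String> offers) {
        return offers.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Splits every text into words first and only then groups, so the keys are words.
    static Map<String, Long> countWords(List<String> offers) {
        return offers.stream()
                .flatMap(offer -> Arrays.stream(offer.toLowerCase(Locale.ROOT).split("[^a-z]+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for the job offer texts.
        List<String> offers = List.of("python and aws", "python and aws", "data and python");
        System.out.println(countParagraphs(offers)); // keys are the full texts
        System.out.println(countWords(offers));      // keys are the individual words
    }
}
```

With three offers of which two are identical, countParagraphs returns two keys, while countWords returns one key per distinct word.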



I'd suggest you start decomposing your program into smaller chunks of code, i.e. smaller methods (or even classes), and start writing unit tests for them.
At the moment it is a bit messy.

So you could create a class, for example WebCrawler or a similar name, which would call an API and get the response (text) for the given keyword(s). It could also contain some methods to parse that text, i.e. to make a collection of the keywords used in the text, not full paragraphs like you have now.

Then another class with some related methods, which could analyse the given collection(s) of keywords, i.e. group by, return the most recurring word and similar...

That way you could organise your program a bit better, so you'd find it easier to assemble the pieces of the puzzle into one.
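
A rough sketch of what such a split might look like is below. Apart from the WebCrawler name suggested above, every name here (KeywordAnalyser, keywordsFor, groupByWord, mostRecurring, DecompositionDemo) is hypothetical, and the real API call is replaced by a function passed into the constructor so that the classes can be unit tested without a network.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

// Gets the text for a search term and turns it into a list of keywords.
class WebCrawler {
    // Assumption: the API call is injected, so a test can supply canned text.
    private final Function<String, String> fetcher;

    WebCrawler(Function<String, String> fetcher) {
        this.fetcher = fetcher;
    }

    List<String> keywordsFor(String searchTerm) {
        String text = fetcher.apply(searchTerm);
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z]+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toList());
    }
}

// Analyses a collection of keywords.
class KeywordAnalyser {
    Map<String, Long> groupByWord(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Empty if the list has no words; for ties, any of the top words may be returned.
    Optional<String> mostRecurring(List<String> words) {
        return groupByWord(words).entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}

class DecompositionDemo {
    public static void main(String[] args) {
        // Canned response instead of a real API call.
        WebCrawler crawler = new WebCrawler(term -> "Python, AWS and more Python for " + term);
        List<String> words = crawler.keywordsFor("data scientist");
        System.out.println(new KeywordAnalyser().mostRecurring(words));
    }
}
```

Each class can then get its own small unit tests: one set for the parsing, one for the analysis.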
 
Liutauras Vilda
Marshal
Note that I've moved your topic to the Java in General forum, which is reserved for more advanced problems than the Beginning Java forum.

I think this program requires somewhat more advanced skills than beginners are expected to have, as it involves calling a REST API, parsing its response and similar.
 
Jacek jaryszek
Ranch Hand
Thank you so much Liutauras!

Awesome help from your side!

I am a total beginner, which is why I put the question into the Java Beginner category.

OK,
I managed to write code that works, and I hope it will help other people like me.



Thank you very much for the tip about classes.
I am switching from VBA (where OOP is very limited and I rarely use it) to Java, and I am not in the habit of adding a class for each chunk of code.
But let me try to do this and post it here to make this beautiful and nice!

Best,
Jacek
 
Carey Brown
Saloon Keeper
I was trying to get the histogram sorted by value (reversed), then sub-sorted by key (natural order).
Before I put in ".thenComparing(...)" it was OK, and now I can't figure out what the compiler wants from me.


**** Put this in its own THREAD
**** Solution found, see thread link above
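
The solution from the linked thread is not reproduced here. As a general illustration only, one common way to express "value reversed, then key natural" is to give the first comparator an explicit type witness, so the compiler knows the entry type before reversed() and thenComparing(...) are chained on. The class and method names below are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class HistogramSortSketch {

    // Orders entries by count (largest first), then by word (natural order).
    static List<Map.Entry<String, Long>> sortHistogram(Map<String, Long> histogram) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(histogram.entrySet());
        // The <String, Long> witness fixes the entry type of the first comparator;
        // without it, the chained calls leave the compiler nothing to infer the types from.
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed()
                .thenComparing(Map.Entry.comparingByKey()));
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Long> histogram = Map.of("home", 3L, "remote", 2L, "can", 2L, "love", 1L);
        sortHistogram(histogram)
                .forEach(entry -> System.out.println(entry.getValue() + " " + entry.getKey()));
    }
}
```

For the sample map this prints home first, then can and remote (same count, alphabetical), then love.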
 
Jacek jaryszek
Ranch Hand
Hi guys,

I managed to put the code into 3 classes:

Main:


1. Question: should the method countFrequencies also be in a separate class or not? What is the best practice here?

UrlReader class:


2. What should I improve here?

TextParser class:


3. What should I change here?

What do you think about this code? What should I improve? Which best practices should I include?

4. In the next step I want to add a Swing app with listeners. What would be good practice for continuing?

Best Wishes,
Jacek
 
Carey Brown
Saloon Keeper
Your indentation is pretty good but you missed a place or two that should have been indented.

It looks like you are opening several resources but I don't see any close() calls.
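
The original UrlReader code is not shown here, so the following is only a generic sketch of how try-with-resources closes a reader automatically. The class ResourceSketch and the method readAll are hypothetical, and a StringReader stands in for whatever stream the real code opens.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.stream.Collectors;

class ResourceSketch {

    // Reads all text from the source. try-with-resources closes the
    // BufferedReader (and the reader it wraps) even if reading fails,
    // so no explicit close() call is needed.
    static String readAll(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a reader opened on a URL or a file.
        System.out.println(readAll(new StringReader("first line\nsecond line")));
    }
}
```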

Your regular expressions need work; I hope you are thoroughly testing them.


Also note that, inside a character class, \\--? is the range of characters from '-' to '?' inclusive, not '--'.

When you want a '-' in a character set, it should appear as the first or last character in the set.


Note that * is a greedy quantifier, whereas *? is its reluctant (non-greedy) form.
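
A small sketch to try these points out, using made-up patterns rather than the ones from the TextParser class; the class RegexSketch and the helper firstMatch are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSketch {

    // Returns the first match of the regex in the input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // A '-' placed last in a character class is a literal hyphen.
        System.out.println("a-b".matches("[ab-]+"));
        // "\\--?" inside a class is the range from '-' to '?', which also covers the digits.
        System.out.println("5".matches("[\\--?]"));
        // Greedy * grabs as much as it can; reluctant *? stops at the first possible end.
        System.out.println(firstMatch("<.*>", "<a><b>"));
        System.out.println(firstMatch("<.*?>", "<a><b>"));
    }
}
```

The first two lines print true; the greedy pattern matches the whole of "<a><b>", while the reluctant one matches only "<a>".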
 
Liutauras Vilda
Marshal
Jacek, I've just assigned you a cow for your great efforts reworking all this - well deserved.
 
Jacek jaryszek
Ranch Hand
Thank you, guys!

Oh wow, that is nice.
And for me the biggest pleasure is that I made my first app, created classes, and you are not saying that the code sucks!

But if you see anything more, please let me know.

Best,
Jacek
 