# Understanding and building a social network algorithm

sumit anand kumar
Ranch Hand
Posts: 83
I am not sure whether this is the right platform to ask this question.
But my problem statement is: I have a book shop and x clients (x is huge).
A client can tell me whether a book is good or bad (not recommended).
I have internal logic to group similar books together, so if a client says a book is bad, he is saying that similar books are bad too and should not be shown to him.
I oblige and hide those books. Clients can also interact among themselves, and each pair of clients has a mutual confidence level.
A case arises when client A says book X1 is bad. Hence I blacklist X1, X2, X3, X4, etc.
But his friend, client B, says X3 is good. So now I may have to show X3 to A.
I was thinking of building a social network of all my clients based on their interactions, so that I can calculate their mutual confidence level.
So in the above scenario, if the mutual confidence level is very high I will show X3 to A; otherwise I won't show X3 to A.
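The rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `SIMILAR`, `CONFIDENCE`, `THRESHOLD` and `hidden_books` are hypothetical, the 0.7 cut-off for "very high" confidence is arbitrary, and the choice that a friend's opinion never overrides a book the client rated bad himself is my own assumption, not part of the problem statement.

```python
# Hypothetical data: similar-book groups, pairwise confidence in [0, 1].
SIMILAR = {"X1": {"X1", "X2", "X3", "X4"}}       # output of the internal grouping logic (assumed)
CONFIDENCE = {("A", "B"): 0.9, ("A", "C"): 0.2}  # mutual confidence, stored once per pair
THRESHOLD = 0.7                                  # assumed cut-off for "very high" confidence


def confidence(u, v):
    """Mutual confidence is symmetric, so look the pair up in either order."""
    return CONFIDENCE.get((u, v), CONFIDENCE.get((v, u), 0.0))


def hidden_books(client, bad, good_by_friend):
    """Return the set of books to hide from `client`.

    bad: books the client explicitly marked bad.
    good_by_friend: {friend: set of books that friend marked good}.
    """
    blacklist = set()
    for book in bad:
        # Blacklist the book together with everything grouped with it.
        blacklist |= SIMILAR.get(book, {book})
    for friend, books in good_by_friend.items():
        if confidence(client, friend) >= THRESHOLD:
            # A trusted friend's opinion lifts the inferred blacklist entries,
            # but not a book the client rated bad himself (assumption).
            blacklist -= (books - set(bad))
    return blacklist
```

With this data, `hidden_books("A", {"X1"}, {"B": {"X3"}})` keeps X3 visible to A because the A-B confidence is above the threshold, while the same recommendation from C (confidence 0.2) leaves X3 hidden.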
I want to get started on building the social network and assigning a weight to the path between two nodes (my clients). Please suggest some good pointers on where I can start:
any books, websites, etc.
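One possible way to put a weight on a path between two clients, sketched here purely as an assumption: treat each direct confidence as an edge weight in (0, 1], score a path by the product of its edge weights, and take the best path. Because products of such weights never increase along a path, a Dijkstra-style search with a max-heap works. The function name `path_confidence` and the graph layout (a dict of dicts) are hypothetical.

```python
import heapq


def path_confidence(graph, source, target):
    """Highest product of edge confidences over any path from source to target.

    graph: {node: {neighbour: confidence in (0, 1]}}; list both directions
    for an undirected (mutual) relationship.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated confidence
    while heap:
        neg_conf, node = heapq.heappop(heap)
        conf = -neg_conf
        if node == target:
            return conf
        if conf < best.get(node, 0.0):
            continue  # stale heap entry, a better path was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = conf * weight
            if candidate > best.get(neighbour, 0.0):
                best[neighbour] = candidate
                heapq.heappush(heap, (-candidate, neighbour))
    return 0.0  # target is unreachable


# Example network: A and C are linked weakly, but both are close to B.
graph = {
    "A": {"B": 0.9, "C": 0.5},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"A": 0.5, "B": 0.8},
}
```

Here the indirect path A-B-C (0.9 × 0.8) scores higher than the direct A-C edge (0.5), so the indirect value would be used. Other path scores (minimum edge, hop-limited sums) are equally plausible; this is only one choice.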

I will use Hadoop in my implementation.

Martin Vajsar
Sheriff
Posts: 3752
I cannot comment on the algorithm, but one thing struck me: how big is X going to be? Given today's prices of storage and memory, I'd say (very generally) that anything short of a billion records should be doable undistributed. Distribution complicates things considerably. Unless you actually want to learn/practice Hadoop, I'd say you should establish much more firmly whether you do need the distributed thing.

sumit anand kumar
Ranch Hand
Posts: 83
Martin Vajsar wrote: I cannot comment on the algorithm, but one thing struck me: how big is X going to be? Given today's prices of storage and memory, I'd say (very generally) that anything short of a billion records should be doable undistributed. Distribution complicates things considerably. Unless you actually want to learn/practice Hadoop, I'd say you should establish much more firmly whether you do need the distributed thing.

Yes Martin, I will be using Hadoop for this. The problem is not x (my clients), which will be on the order of a few hundred thousand, but X (their actions), which can arrive at high frequency.
Think of Twitter or LinkedIn. The problem is not creating the network; the problem is processing the big 'X' to find the trust level in the network.
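To make the "big X" step concrete, here is a minimal sketch of how interaction events might be aggregated into a confidence score in a map/reduce style. It is plain Python standing in for a Hadoop job, not Hadoop API code, and everything in it is an assumption: the event format (a pair of client ids), the function names, and the saturating formula `count / (count + k)`.

```python
from collections import Counter


def map_event(event):
    """Map step: emit one (pair, 1) record per interaction event.

    The pair is sorted so that A->B and B->A count towards the same key.
    """
    a, b = event
    return (tuple(sorted((a, b))), 1)


def reduce_counts(records):
    """Reduce step: sum the counts for each client pair."""
    totals = Counter()
    for pair, n in records:
        totals[pair] += n
    return totals


def to_confidence(count, k=5):
    """Saturating score in [0, 1): more interactions give higher confidence.

    k is an assumed tuning constant; the score is 0.5 when count == k.
    """
    return count / (count + k)


# Hypothetical event stream: each tuple is one interaction between two clients.
events = [("A", "B"), ("B", "A"), ("A", "C")]
totals = reduce_counts(map(map_event, events))
scores = {pair: to_confidence(n) for pair, n in totals.items()}
```

Since the reduce step is a plain sum per key, it can be run incrementally over new batches of events and merged with earlier totals, which is what makes it a reasonable fit for high-frequency actions.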

Steve Fahlbusch
Bartender
Posts: 605
For such an app I would use a Rete (net) algorithm to determine the closeness (or degree of trust).