zb cong

Ranch Hand

Posts: 416

posted 13 years ago

hello

1. A cost-map contains M*N cells, we have the cost rate for each individual cell.

2. The path is defined as a sequence of cells that connect to each other directly.

3. If two cells are in the same row or column, and there is no cell between them, these two cells are defined as directly connected.

4. The cost of a path is the summary of all the cells’ costs involved.

question:

you are required to find out a cheapest path from one specified cell to another.

i have designed several class architecture as follow,maybe they are not complete,but it is not important,what i more care about is:

thank you!

**********************************************************************

A: is the row number of the map

B: is the column number of the map

getCell returns the Cell with row and column specified. The start index is 0.

getStartCell and GetTargetCell return the start and end cell of the problem.

getTraceLength returns how many cells are involved in this trace.

getCell returns the number n cell in this trace.

getCost returns the total cost of this trace.

1. A cost-map contains M*N cells, we have the cost rate for each individual cell.

2. The path is defined as a sequence of cells that connect to each other directly.

3. If two cells are in the same row or column, and there is no cell between them, these two cells are defined as directly connected.

4. The cost of a path is the summary of all the cells’ costs involved.

question:

you are required to find out a cheapest path from one specified cell to another.

i have designed several class architecture as follow,maybe they are not complete,but it is not important,what i more care about is:

**how can i implement the algrithm to find the cheapest path?**,who can give me the code snippet about the algrithm to get the cheapest path?or where can i find it?thank you!

**********************************************************************

public class Map{

int A;

int B;

public Cell getCell(int row, int col){

}

public Cell getStartCell(){

}

public Cell getTargetCell(){

}

}

public class Map{

int A;

int B;

public Cell getCell(int row, int col){

}

public Cell getStartCell(){

}

public Cell getTargetCell(){

}

}

A: is the row number of the map

B: is the column number of the map

getCell returns the Cell with row and column specified. The start index is 0.

getStartCell and GetTargetCell return the start and end cell of the problem.

public class Cell{

int row;

int col;

public int getCost(){

}

}

public class Cell{

int row;

int col;

public int getCost(){

}

}

public class Trace{

public int getTraceLength(){}

public Cell getCell(int n){}

public int getCost(){}

}

public class Trace{

public int getTraceLength(){}

public Cell getCell(int n){}

public int getCost(){}

}

getTraceLength returns how many cells are involved in this trace.

getCell returns the number n cell in this trace.

getCost returns the total cost of this trace.

jeroen riezebeek

Greenhorn

Posts: 9

Sonny Pondrom

Ranch Hand

Posts: 128

zb cong

Ranch Hand

Posts: 416

zb cong

Ranch Hand

Posts: 416

Eddie Vanda

Ranch Hand

Posts: 283

posted 13 years ago

Hi zb,

Sounds to me like the traveling salesman problem (TSP). Instead of various distances you have various costs of each node. Plenty of stuff on that on Google - it's not a trivial problem. The shortes way may not be the cheapest because you might cross any number of very expensive nodes.

It seems to me you have to exhaustively find all possible paths (combinations and permutations) between the two nodes, cost each one, and keep the cheapest. That might take a long time in even a moderately large matrix.

Maybe someone has tackled this before??

Sounds to me like the traveling salesman problem (TSP). Instead of various distances you have various costs of each node. Plenty of stuff on that on Google - it's not a trivial problem. The shortes way may not be the cheapest because you might cross any number of very expensive nodes.

It seems to me you have to exhaustively find all possible paths (combinations and permutations) between the two nodes, cost each one, and keep the cheapest. That might take a long time in even a moderately large matrix.

Maybe someone has tackled this before??

The nice thing about Standards is that there are so many to choose from!

Maarten Vergouwen

Ranch Hand

Posts: 60

posted 13 years ago

I formerly worked on a system that calculated the optimum route for a commercial jetliner to fly through the air, and this problem was very much like the one we faced, except ours was in 3 dimensions

The endsolution was to first define a 'possibilityshape', which in a 3d environement mapped onto the earth takes the shape of a banana -- hence the technical term we used, 'bananoid' (but in your case there are no real-world limitations, hence your bananoid would simply be the entire set of cells.)

Then we actually did calculate every possible route via every possible point within the bananoid, whilst still moving towards the target, then compared them for either money or time (which contrary to popular belief are not necessarily the same).

You can also do this using subsets; ie if there are 100 points to go, calculate for the possible next 5, then take the best of those 5, then repeat. Less precise, but much faster.

With some tweaking we managed an average responsetime of 0.8ms on calculations of transatlantic flights.

[ February 23, 2004: Message edited by: Maarten Vergouwen ]

The endsolution was to first define a 'possibilityshape', which in a 3d environement mapped onto the earth takes the shape of a banana -- hence the technical term we used, 'bananoid' (but in your case there are no real-world limitations, hence your bananoid would simply be the entire set of cells.)

Then we actually did calculate every possible route via every possible point within the bananoid, whilst still moving towards the target, then compared them for either money or time (which contrary to popular belief are not necessarily the same).

You can also do this using subsets; ie if there are 100 points to go, calculate for the possible next 5, then take the best of those 5, then repeat. Less precise, but much faster.

With some tweaking we managed an average responsetime of 0.8ms on calculations of transatlantic flights.

[ February 23, 2004: Message edited by: Maarten Vergouwen ]

Jeroen Wenting

Ranch Hand

Posts: 5093

David Weitzman

Ranch Hand

Posts: 1365

posted 13 years ago

Assuming that the cost map only contains non-negative numbers and that you won't be computing the distance between more than a few completely different pairs of cells, I second Jeroen that Dijkstra's Algorithm is the way to go.

With Dijkstra's Algorithm, you'll actually end up calculating the shortest path from a source cell to every other cell. The principle it's based on is simple:

At any given time, you have two sets of cells. One set of cells is already solved --- you know the shortest path from the source to any one of these cells. The rest of the cells are unsolved, but with a critical invariant. The cost of the most expensive path in the solved nodes is less than or equal to the cost of the least expensive path in the unsolved nodes (although you don't actually know what that least expensive path is yet). Note that since the costs are all non-zero, getting far away is more expensive than getting nearby. Thus cells in the solved set are adjacent and branching out from the source cell.

There is always at least one cell in the unsolved set that can be safely added to the solved set. Step one to solving the problem: Which node can be safely added, and what will its distance be? Keep in mind the invariant.

[ February 26, 2004: Message edited by: David Weitzman ]

With Dijkstra's Algorithm, you'll actually end up calculating the shortest path from a source cell to every other cell. The principle it's based on is simple:

At any given time, you have two sets of cells. One set of cells is already solved --- you know the shortest path from the source to any one of these cells. The rest of the cells are unsolved, but with a critical invariant. The cost of the most expensive path in the solved nodes is less than or equal to the cost of the least expensive path in the unsolved nodes (although you don't actually know what that least expensive path is yet). Note that since the costs are all non-zero, getting far away is more expensive than getting nearby. Thus cells in the solved set are adjacent and branching out from the source cell.

There is always at least one cell in the unsolved set that can be safely added to the solved set. Step one to solving the problem: Which node can be safely added, and what will its distance be? Keep in mind the invariant.

[ February 26, 2004: Message edited by: David Weitzman ]

David Weitzman

Ranch Hand

Posts: 1365

posted 13 years ago

It's common in mathematics and computer science for the solution to a slight modification of an easily solvable problem to be quite difficult.

For example, one problem might be to find a minimum cost route from point A to B to C back to A in a graph. That's similar to the problem stated above. It's a problem of finding a shortest path, which can be done efficiently.

The Travelling Salesman Problem is just slightly different: Find a minimum cost route starting and ending at A that passes through both B and C. The difference is just that the order is unspecified, so you can take A-B-C-A or A-C-B-A. For three points that's not so bad, but for N points there are (N-1)! possible orderings. Trying them all will take a long, long time.

*Sounds to me like the traveling salesman problem (TSP). Instead of various distances you have various costs of each node.*

It's common in mathematics and computer science for the solution to a slight modification of an easily solvable problem to be quite difficult.

For example, one problem might be to find a minimum cost route from point A to B to C back to A in a graph. That's similar to the problem stated above. It's a problem of finding a shortest path, which can be done efficiently.

The Travelling Salesman Problem is just slightly different: Find a minimum cost route starting and ending at A that passes through both B and C. The difference is just that the order is unspecified, so you can take A-B-C-A or A-C-B-A. For three points that's not so bad, but for N points there are (N-1)! possible orderings. Trying them all will take a long, long time.

Maarten Vergouwen

Ranch Hand

Posts: 60

posted 13 years ago

Yes, weather is the single most important factor in determining the route.

We received the weather information from KNMI in the so-called GRIB format (binary representation of marsden square data) twice a day.

There was always a big panic when there was no weather info available, since if you dont have it regulations force you to take so much extra fuel aboard that you're losing huge amounts of money with every flight.

Originally posted by Jeroen Wenting:

interesting work Maarten. Did you also incorporate realtime weather data and predictions into the calculations to account for shifting winds at different altitudes and lattitudes?

Yes, weather is the single most important factor in determining the route.

We received the weather information from KNMI in the so-called GRIB format (binary representation of marsden square data) twice a day.

There was always a big panic when there was no weather info available, since if you dont have it regulations force you to take so much extra fuel aboard that you're losing huge amounts of money with every flight.

sever oon

Ranch Hand

Posts: 268

posted 13 years ago

I too see this as the TSP--in fact, it's useful to question whether it's topologically equivalent. In the TSP, a travelling salesman must hawk his wares in 20 cites across the country while paying his own expenses. He wants to keep fuel costs to a minimum (he's driving) and he wants to hit every city at least once.

One thing that occurs to me is that, in the TSP, the salesman can hit any city from any other city and incur the cost of the fuel to drive there. In your problem, if your "salesman" is in cell (1,2) and wants to jump over to cell (3,5), it's not allowed. He can only visit the "directly connected" cities. So they are topologically different problems, and that leads me to think that this problem's solution is not on the order of n! as the TSP is.

Here is a for more eloquent discourse on this algorithm than I could deliver.

Instead, I will get off the topic and suggest a solution to the TSP that others might find interesting. If we accept there's no way to get the One Best Answer to the TSP because it's too much calculatin', is there a way to get close? Yes--through genetics!

How's it work? Ok, imagine that we make a list of the 20 cities that represents the order in which the TS will visit them. For any such list, we can quickly calculate the total distance the TS will cover. Next, imagine we create a pool of such lists, say 500 of them, and each one represents an itinerary that is randomly generated. Next, we sort the pool of these lists from shortest distance to longest. Since we're searching for the shortest possible path, we can define the "value of" a pool as the length of its shortest itinerary.

Here's the genetic part. If we "select out" the longest half of the pool by killing them, we are left with the 250 shortest itineraries and 250 empty spots in the pool. Next, we copy each of the remaining 250 lists to one of the empty spots, but in doing so, we introduce a "mutation"--we alter the copy in some way. We can use a "point mutation", simply swapping two adjacent cities, or a "crossover mutation", in which an entire subsequence of that itinerary is moved...we could even come up with a rule that let's surviving members in the pool "sexually reproduce" and combine their lists. Always reserve a couple of spots for totally new, randomly generated itineraries to introduce a different kind of diversity.

This process of selecting out some percentage of the pool as weak and replacing their spots with mutated copies of the "stronger" members is an evolution. All you have to do now is repeat the evolutions until some criteria is met, say, the value of a pool varies by less than 1% over 5 evolutions.

It is surprising how quickly this algorithm homes in on the shortest path. Whereas the brute force method would take billions and billions of years to run to completion for just 20 cities, this can oftentimes return an itinerary within a few percent of the shortest possible path in under a minute on a modern desktop computer. Furthermore, you can design maps that intentionally have an obvious shortest path built in and test to see how long this algorithm takes to find it...and you'll be surprised at the results.

It helps a lot if you make sure every aspect of this evolutionary algorithm can be finetuned. For instance, instead of killing off half of the population in each evolution, make that number configurable and try different values. Perform the different kinds of mutations probabilistically...a point mutation will occur 50% of the time a copy is made, a crossover 10%. In this way, one copy might incur both, increasing the diversity. And on and on...very fun to play with.

Also, you can modify this approach to great effect by setting up different kinds of pools. The above pool would be a random pool, but you could also set up a matrix pool in which you don't generate lists, you generate matrices that approximately correspond (with some randomness, obviously) to where cities fall geographically on the map. In this case, instead of simply adding up the distance for a list, you use djikstra's algo to get the shortest path between distant cities to set up subsequences that are likely to yield good results.

sev

[ February 25, 2004: Message edited by: sever oon ]

One thing that occurs to me is that, in the TSP, the salesman can hit any city from any other city and incur the cost of the fuel to drive there. In your problem, if your "salesman" is in cell (1,2) and wants to jump over to cell (3,5), it's not allowed. He can only visit the "directly connected" cities. So they are topologically different problems, and that leads me to think that this problem's solution is not on the order of n! as the TSP is.

Here is a for more eloquent discourse on this algorithm than I could deliver.

Instead, I will get off the topic and suggest a solution to the TSP that others might find interesting. If we accept there's no way to get the One Best Answer to the TSP because it's too much calculatin', is there a way to get close? Yes--through genetics!

How's it work? Ok, imagine that we make a list of the 20 cities that represents the order in which the TS will visit them. For any such list, we can quickly calculate the total distance the TS will cover. Next, imagine we create a pool of such lists, say 500 of them, and each one represents an itinerary that is randomly generated. Next, we sort the pool of these lists from shortest distance to longest. Since we're searching for the shortest possible path, we can define the "value of" a pool as the length of its shortest itinerary.

Here's the genetic part. If we "select out" the longest half of the pool by killing them, we are left with the 250 shortest itineraries and 250 empty spots in the pool. Next, we copy each of the remaining 250 lists to one of the empty spots, but in doing so, we introduce a "mutation"--we alter the copy in some way. We can use a "point mutation", simply swapping two adjacent cities, or a "crossover mutation", in which an entire subsequence of that itinerary is moved...we could even come up with a rule that let's surviving members in the pool "sexually reproduce" and combine their lists. Always reserve a couple of spots for totally new, randomly generated itineraries to introduce a different kind of diversity.

This process of selecting out some percentage of the pool as weak and replacing their spots with mutated copies of the "stronger" members is an evolution. All you have to do now is repeat the evolutions until some criteria is met, say, the value of a pool varies by less than 1% over 5 evolutions.

It is surprising how quickly this algorithm homes in on the shortest path. Whereas the brute force method would take billions and billions of years to run to completion for just 20 cities, this can oftentimes return an itinerary within a few percent of the shortest possible path in under a minute on a modern desktop computer. Furthermore, you can design maps that intentionally have an obvious shortest path built in and test to see how long this algorithm takes to find it...and you'll be surprised at the results.

It helps a lot if you make sure every aspect of this evolutionary algorithm can be finetuned. For instance, instead of killing off half of the population in each evolution, make that number configurable and try different values. Perform the different kinds of mutations probabilistically...a point mutation will occur 50% of the time a copy is made, a crossover 10%. In this way, one copy might incur both, increasing the diversity. And on and on...very fun to play with.

Also, you can modify this approach to great effect by setting up different kinds of pools. The above pool would be a random pool, but you could also set up a matrix pool in which you don't generate lists, you generate matrices that approximately correspond (with some randomness, obviously) to where cities fall geographically on the map. In this case, instead of simply adding up the distance for a list, you use djikstra's algo to get the shortest path between distant cities to set up subsequences that are likely to yield good results.

sev

[ February 25, 2004: Message edited by: sever oon ]

It is sorta covered in the JavaRanch Style Guide. |