
Gary W. Lucas

Ranch Hand
since Jun 25, 2014

Recent posts by Gary W. Lucas

I've just updated the documentation for the Tinfour open-source Java project. Tinfour supports the creation of triangular mesh structures from unstructured data sets (collections of randomly positioned data points). The focus of the project is the Delaunay triangulation. While that topic is a bit specialized, I believe there is enough general material in the documentation that visitors here at CodeRanch may find it interesting (and maybe even useful).

You can visit my main documentation page at The Tinfour Documentation Page

In particular, I've updated the notes on Natural Neighbor Interpolation, which is a technique for creating smooth surfaces from unstructured data such as elevation surveys, weather observations, etc. I've also posted new material describing the algorithms used by the interpolator (see A Fast and Accurate Algorithm for Natural Neighbor Interpolation).
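
If you'd like to see what this looks like in practice, here is a minimal sketch of building a triangulation and interpolating a value at a query point (class names follow the Tinfour API as I recall it; check the project documentation for the exact signatures in the current release):

    import java.util.ArrayList;
    import java.util.List;
    import org.tinfour.common.Vertex;
    import org.tinfour.interpolation.NaturalNeighborInterpolator;
    import org.tinfour.standard.IncrementalTin;

    // Build a Delaunay triangulation from scattered (x, y, z) samples,
    // then interpolate a value at an arbitrary query point.
    List<Vertex> samples = new ArrayList<>();
    samples.add(new Vertex(0.0, 0.0, 10.0));
    samples.add(new Vertex(1.0, 0.0, 12.0));
    samples.add(new Vertex(0.0, 1.0, 11.0));
    samples.add(new Vertex(1.0, 1.0, 13.0));

    IncrementalTin tin = new IncrementalTin(1.0); // nominal point spacing
    tin.add(samples, null);                       // null: no progress monitor

    NaturalNeighborInterpolator nni = new NaturalNeighborInterpolator(tin);
    double z = nni.interpolate(0.5, 0.5, null);   // null: use the vertex z values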

Finally, I've received a lot of support from folks here at CodeRanch, and I'd like to thank you all for your many helpful suggestions and encouragement.

Gary
3 weeks ago
Thanks for your thoughtful reply. I think the information you provided will be useful.

I'm looking forward to reading your book.

Gary

Also, a bit of an apology... After I made my original post, it occurred to me that you might not be able to find the question buried in all that text I wrote. Sorry the post wasn't a bit more to the point.

2 months ago
First off, your book looks like it will be very useful and I am looking forward to reading it.

So far, I've only taken a high-level approach to Python (basically, treating it as a scripting language).  I am kicking around the idea of attempting a more ambitious programming effort. I have a software library written in Java for performing custom data compression and other kinds of analysis on raster data (particularly geophysical data). I would like to implement a compatible Python solution. Since my typical data set involves millions (and sometimes billions) of data values, I am concerned about throughput and efficiency.

And here I find myself in unfamiliar (and confusing) territory. As I understand it, NumPy is written in C/C++ with Python bindings. Is this the right model for what I am doing, or can I accomplish it entirely in Python? Some of the processing I perform involves tight loops and repetitive computations over large grids. Most of the math is plain arithmetic (not many trig or logarithm calls).

My other consideration is that I want my work to be in a form that people can actually use without too much trouble. I would also want it to be reasonably compatible with other tool sets like NumPy, SciPy, etc.

Thanks in advance for your consideration.  And good luck with your book!

Gary
2 months ago
I've posted some new articles on techniques for lossless data compression of raster data. The techniques are implemented in Java, and source code is available. I've experimented with data compression for both integer and floating-point data types. So far, I've been mostly working with geophysical information (elevation, ocean currents, surface temperature), but the techniques should be useful for a reasonably broad range of numerical data applications.
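
To give a flavor of the predictor idea, here is a generic sketch of predictive residual coding. This illustrates the general technique, not the actual code from the articles:

    // Replace each cell with its prediction residual. Each cell is
    // predicted from its west, north, and northwest neighbors (the
    // classic "triangle" predictor). Residuals cluster near zero, so
    // a conventional entropy coder compresses them far better than
    // the raw values. Because a decoder can recompute each prediction
    // from cells it has already restored, the transform is exactly
    // invertible, which is what makes the compression lossless.
    static int[][] toResiduals(int[][] grid) {
        int rows = grid.length;
        int cols = grid[0].length;
        int[][] r = new int[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                int west  = j > 0 ? grid[i][j - 1] : 0;
                int north = i > 0 ? grid[i - 1][j] : 0;
                int nw    = (i > 0 && j > 0) ? grid[i - 1][j - 1] : 0;
                r[i][j] = grid[i][j] - (west + north - nw);
            }
        }
        return r;
    }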

If you are interested, you can read more at

Lossless Compression for Raster Data using Optimal Predictors

and

Lossless Compression for Floating-Point Data  

Feel free to let me know if you have any questions or suggestions.

Gary
4 months ago
Recently, I contributed an enhancement to the open-source Apache Commons Imaging project that enables it to read high-resolution elevation data from U.S. Geological Survey (USGS) Cloud-Optimized GeoTIFF files. GeoTIFF is a variation of the TIFF image format that embeds georeferencing information so that its imagery can be applied to map-based applications.

Anyway, I just posted an article and some example Java code describing how to use the GeoTIFF files to create shaded-relief map imagery. The article includes a basic algorithm for lighting and color rendering as well as some pictures that show off the quality of the USGS data. It also includes a bit of background on the GeoTIFF standard and links to some code for a B-Spline surface fitting class implemented in Java.
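
The lighting calculation itself is pleasantly simple. Here is a generic sketch of the standard hillshading math (illustrative only, not the exact code from the article):

    // Estimate the surface normal at grid cell (i, j) from central
    // differences, then take its dot product with a unit vector
    // pointing toward the light source. cellSize is the grid spacing
    // in the same units as the elevations.
    static double shade(double[][] z, int i, int j, double cellSize, double[] light) {
        double dzdx = (z[i][j + 1] - z[i][j - 1]) / (2 * cellSize);
        double dzdy = (z[i + 1][j] - z[i - 1][j]) / (2 * cellSize);
        // The un-normalized surface normal is (-dzdx, -dzdy, 1).
        double len = Math.sqrt(dzdx * dzdx + dzdy * dzdy + 1);
        double dot = (-dzdx * light[0] - dzdy * light[1] + light[2]) / len;
        return Math.max(0, dot); // cells facing away from the light get zero
    }

The returned value, in the range 0 to 1, scales the brightness of whatever color is assigned to the cell (by elevation, for instance).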

If you are interested in these topics, you can find the article at  The Gridfour Project's Elevation GeoTIFFs Article
6 months ago
Thanks for your reply.  I think your advice will be useful in my investigations.  I'm thinking about a mobile platform that starts out with some basic behaviors, but learns to optimize them over time.  For example, if I were implementing a walker (and that's just an example, I'm not that good), it would start off with some basic gaits pre-programmed into the system, but would gradually improve them based on experience negotiating its environment.   So that seems to fit the pattern you suggested with the Training and Inference areas.

I look forward to reading your book and, no doubt, significantly revising some of my ideas.

Gary
I have been looking into machine learning applications for small mobile robots. Do you have any recommendations on the best way to apply deep reinforcement learning techniques on more modest processors?

I see that you discuss a bi-pedal walker in your Appendix B.  I'm particularly looking forward to reading about that one.

Thanks. And good luck with your book!

Gary
Good point on the determinant.  It's kind of a reminder that, if the three points define a valid triangle, they can also be used to construct the axes for a 3D coordinate system.

Incidentally, the area computation I posted earlier is algebraically equivalent to the determinant (with an appropriate assignment of variables to the vertices).
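
For the record, writing the vertices as a, b, and c, the signed area is half the determinant:

    2 * area = | b.x - a.x   b.y - a.y |
               | c.x - a.x   c.y - a.y |

             = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x)

which matches the expression in my earlier post, term for term.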
11 months ago

One way to check the correctness of the input vertices is to compute the area of the triangle:

   double h = (c.y - a.y) * (b.x - a.x) - (c.x - a.x) * (b.y - a.y);
   double area = h / 2;

This calculation produces a positive area if the vertices are given in counterclockwise order and a negative area if they are given in clockwise order. But if the vertices are collinear, or two of them coincide, it gives a value at or very near zero. So I suggest something like:

  if (Math.abs(area) < 1.0e-6) {
      throw new IllegalArgumentException("Vertices are collinear or specify a degenerate triangle");
  }

As noted in the post above, IllegalArgumentException is an appropriate exception here. Throwing an Error is seldom a good idea.


Also, a warning about something that might cause you trouble later on... Your rotation code rotates the triangle about the origin. For graphics purposes, this is fine if the coordinates of the triangle are close to the origin. But if the triangle is not close to the origin, you could end up rotating it right out of your display area. I have no idea what problem you're trying to solve here, but keep this in mind as a potential cause if you're debugging and get a result you don't understand.
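
If that does turn out to bite you, the usual fix is to rotate about a point of your own choosing, such as the triangle's centroid. A sketch (variable names are just for illustration):

    // Rotate point (x, y) about pivot (px, py) by angle theta, in radians.
    // For a triangle, a natural pivot is the centroid:
    //   px = (a.x + b.x + c.x) / 3;
    //   py = (a.y + b.y + c.y) / 3;
    double dx = x - px;
    double dy = y - py;
    double xr = px + dx * Math.cos(theta) - dy * Math.sin(theta);
    double yr = py + dx * Math.sin(theta) + dy * Math.cos(theta);

Apply that to each of the three vertices and the triangle spins in place.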

11 months ago
I recently posted a new open-source software project that may be of interest to the visitors at this forum, particularly those working in Java.

The Gridfour Project is an attempt to create and distribute software tools for processing raster (grid) based data. Potential applications include geophysical, scientific, and engineering data. Eventually, I hope the project will provide tools for rendering, data compression, contouring, surface analysis, and so forth.

The first module written for the Gridfour project is a file-backed data store that is intended to assist applications in building and managing collections of raster data that are too large to be kept in memory. I also posted some B-Spline interpolation utilities and am working on some rendering tools.

You can visit the Gridfour Project at https://github.com/gwlucastrig/gridfour
I've also got a Wiki page at https://github.com/gwlucastrig/gridfour/wiki

Thanks for your time and attention. I wish you all the best of luck on your own software endeavors.

Gary
1 year ago
I implemented a Huffman tree just a few weeks ago, so fresh off my own experience (which may differ from yours), may I suggest a good old-fashioned array?

Before I start, I will point out that for my application, I had a reasonably small symbol set (256 symbols... many implementations add a special symbol for "end-of-text").  If yours is much larger, my suggestion might be less effective.  Also, I am assuming you are thinking of a classic Huffman tree rather than an adaptive tree (such as Vitter's algorithm).

From the perspective of a tree, we recall that there are two kinds of nodes: branches and leaves. I used the same class for both, though I certainly could have used two different classes. The node class had roughly the following form (reconstructed from memory, so take the field names as illustrative):
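
    // Sketch of a combined branch/leaf node class. A leaf carries a
    // symbol and a frequency count; a branch carries only a count.
    class HuffmanNode {
        int symbol;          // symbol value (meaningful only for a leaf)
        int count;           // frequency count from the first pass
        HuffmanNode parent;
        HuffmanNode left;    // null for a leaf
        HuffmanNode right;   // null for a leaf
        int bit;             // 0 if this node is its parent's left child, 1 if right

        boolean isLeaf() {
            return left == null;
        }
    }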

I declared two arrays. The first was an array of leaf-node objects, one for each symbol. This array was used for counting and coding symbols. The second was an array of the same objects (i.e., a shallow copy of the first). This array was used for sorting the leaves by their counts and building the Huffman tree.

I looped through my input text symbol-by-symbol. Since my text consisted of bytes, the byte value could be used directly as an array index. Using this index to access the proper leaf node in the primary array, I incremented its count. When I was finished counting symbols, I sorted the second array in ascending order of counts (again, it contained the same node objects as the primary array; it just put them in a different order, so the primary array was unchanged). I removed the zero-count elements. Then I applied Huffman's classic algorithm to build up the tree a pair of nodes at a time. At this point, the second (sorted) array could be discarded.
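
The merge step is often written with a PriorityQueue instead of a sorted array; either way, the idea is the same. A sketch, building on the node class above (nonZeroLeaves is a hypothetical list holding the leaves that survived the zero-count purge):

    import java.util.PriorityQueue;

    // Repeatedly join the two lowest-count nodes under a new branch
    // until a single root remains.
    PriorityQueue<HuffmanNode> queue =
        new PriorityQueue<>((p, q) -> Integer.compare(p.count, q.count));
    queue.addAll(nonZeroLeaves);
    while (queue.size() > 1) {
        HuffmanNode a = queue.poll();   // smallest count
        HuffmanNode b = queue.poll();   // next smallest
        HuffmanNode branch = new HuffmanNode();
        branch.count = a.count + b.count;
        branch.left = a;   a.parent = branch;   a.bit = 0;
        branch.right = b;  b.parent = branch;   b.bit = 1;
        queue.add(branch);
    }
    HuffmanNode root = queue.poll();    // the completed tree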

To encode the text, I again made a pass through the symbols, using each byte value as an index into the master array of leaf nodes. Once I had the leaf, I traversed up the tree from bottom to top. Of course, that direction of traversal is just the opposite of the bit order for the Huffman code (which goes from top to bottom, root to leaf), so I needed a good way to reverse it. This is where an interesting collection class comes into play. I used a Java Deque, treating it as a classic stack structure. As I traversed up the tree (from child to parent), I put each node on the stack. Once I reached the root node, I simply popped the stack to get back the nodes in the proper order for encoding.
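
In code, the stack-based reversal looks something like this (again building on the sketch node class above; for simplicity, the output is a StringBuilder of '0' and '1' characters rather than a true bit stream):

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Emit the Huffman code for one symbol, in root-to-leaf bit order.
    static void encodeSymbol(HuffmanNode[] leaves, int symbol, StringBuilder bits) {
        Deque<HuffmanNode> stack = new ArrayDeque<>();
        HuffmanNode node = leaves[symbol & 0xFF]; // byte value maps to an index
        while (node.parent != null) {             // climb from leaf to root
            stack.push(node);
            node = node.parent;
        }
        while (!stack.isEmpty()) {                // pop to reverse the order
            bits.append(stack.pop().bit);
        }
    }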


For decoding the Huffman message, I just used the tree structure directly.  It was very efficient.  Sometimes, rolling your own data structure is the best way to go.
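
The decoding walk is short enough to sketch too (here bits and output are hypothetical stand-ins for your encoded input and decoded output):

    // Walk from the root, taking the left or right branch for each
    // input bit; emit a symbol and restart whenever a leaf is reached.
    HuffmanNode node = root;
    for (int i = 0; i < bits.length(); i++) {
        node = (bits.charAt(i) == '0') ? node.left : node.right;
        if (node.isLeaf()) {
            output.append((char) node.symbol);
            node = root;
        }
    }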

Hope this helps.

Gary


1 year ago
First off, I'm really interested in seeing your book.  My own experience with graph theory is limited to the old-school stuff and I would very much like to learn more about how it relates to big data applications.

My question: How do Apache Spark and Neo4j hold up in terms of performance and memory use compared to some of the non-Java/non-JVM APIs?

I ask because in the past I've found that graph-related applications require a lot of objects (edges, nodes, etc.) and that the overhead for Java object construction and memory use gets to be a problem.  Seeing how successful Apache Spark is, it seems likely that they've addressed the issue. I am curious as to what your experience has been.

Best of luck on your book.

Gary

Stephan,

Thank you again for the insights you provided about Maven. You helped clarify a number of issues that had confused me.

Gary
2 years ago
Thanks Tim!

Recognizing that the Tinfour project has a relatively narrow user base, I tried to provide enough information to make it accessible to those developers who would be interested in using it.

When you refer to "pretty" documentation, I take it you saw the project wiki pages? Since Triangulated Irregular Networks (TINs) are kind of a specialized topic, I felt it would help potential developers if I provided background information on some of the theory and applications. So I've started posting wiki pages. Some of them are better than others, but my favorite would have to be the one on Robin Sibson's Natural Neighbor Interpolation algorithm. I think that Natural Neighbors is one of the most under-appreciated interpolation techniques out there. Sibson died just a couple of years ago, so I have no idea what he would have thought of my work. But I think that anything I can do to make the Natural Neighbors technique more accessible to potential users is effort well spent.
2 years ago
Stephan,

Thanks for the assessment and for taking the time to put together a Pull Request. I will look at it this weekend. I spent a lot of time looking at the documentation, but I came away with the feeling that there are fundamental Maven use patterns that I am just not seeing. So your updated POMs are especially welcome.

You caught one issue in particular that concerns me. The core module is not supposed to have dependencies on anything except the standard Java API. I'll have to figure out how the commons-math3 dependency got in there.

Gary
2 years ago