Monday, May 6, 2013

Pathfinding: A step by step approach

Two things about pathfinding

Start with the simplest

It is hard to picture an optimal roue that will not start by minimizing distance through real space.  I will start with that fairly simple cost function, then look at fuel and delta-v profiles.

The problem of scale

Most of the graph problems I have dealt with so far have head a few dozen nodes, and so are susceptible to exhaustive depth first.  With thousands of nodes, rapidly expanding to tens of thousands, that goes out the window.  Not only do we have too many nodes for that, but they live in a database and should not be schlepped into memory without a good reasons.  That means I am going to have to use some very directed heuristic like A* or something that takes more direct advantage of the nature of the map.

This should make for interesting problems in Object Relational Mapping, since convention database problems simply do not work this way.

Starting small in data terms

I think that I would do well to set up a full map for a subset of the data - a 10 or even 5 parsec subsector centered on Sirius.  That would let me test the data model to make sure it will do what I need it to before there is a lot of data processed.

Friday, May 3, 2013

User Functions 1: Pathfinding

We have been talking about maps a lot in this blog.  What's the first thing we want to find out from a map?  If I am here, and I want to be there, what route to I follow in my travels?

We have set down a number of rules for travel.  Interestingly enough, none of them say much about the specific capabilities of any ship (at least one that can travel significant distances in space)
  1. You may jump from star A to star B if the mass of B in solar masses divided by the square of the distance from A to B squared is greater than 0.1.
  2. Respect for conservation of momentum and potential energy.
  3. Sufficiently flat space.
  4. Velocity vector at moment of departure pointed directly at target position.
So, if I want to get from A to B, how do I  plot a route.  Well, route planning using graph theory is simply enough a matter of assigning a cost to each edge (clearly directed in this case) and then minimizing the cost to traverse a set of edges from joining A to B - if set a set exists.  The trick is the costing function.

There are static elements to the problem.  Rule 1 is simple enough; the existence of an edge depends on destination mass and that is that.  The detailed cost of rules 2 though 4, however, depend on decisions about technology and choices made on how to use it. 

Why?  Rules 1 and 2 together determine how far we have to go through real space to get to the point to which we can jump to another star, and how far we are out from that star when we arrive.  Rule 4 decides where that point is.  Simple enough, but how much that then costs depends on the delta-V of our ship, and how we want to trade off between time and fuel consumption.  Are there fuel sources en-route?  Do I have to stop to use them, or will they rendezvous with me?   What contingency do I want for emergency? 

How about other consumables?  How do we cost food, water, air - and of course urgency.  It's an interesting set of problems.

Rule 4 has another interesting set of implications.  Lets say I am going from A to B to C.  If A-B-C is very nearly a straight line, I do not have to spend very much fuel in system B to line up my vector to jump to C.  I can (it's a choice) accelerate all I want, or even do mid-flight turnover for the whole flight while passing through B.  But it is a choice.  If on the other hand the angle ABC is highly acute I have to adjust my vector while passing through system B from the A-B direction to the B-C direction.  Not only are there costs to this of itself, my the point at which I mitigate that cost is back in system A when I decide on the magnitude of the AB vector.  Hairy stuff.

In fact, it might be a good idea to come up with a "rule 5" - a correct velocity at which to be traveling when the jump is made.  This at least has the virtue of eliminate a set of optimizations.  Something to think about.

Thursday, May 2, 2013

Swamp Visions

You all know the old saw "When you are up to your ass in alligators, it's hard to remember you are here to drain the swamp."

I've been zooming in on details of star catalog processing for a few weeks now.  I think it is time to re-focus on how I will use this information, and what products will facilitate that use.

So what are the drivers?


My regular hobby, as you will rapidly guess from visiting my other blogs, is wargaming.  I would like to run a campaign game, by  invitation, fora number of my friends.  I would like them to be able to set policies and make plans that will shape my future history, and apply rules to resolve the shared results of the many players' interactions. Practically speaking, it would be great to have a website where players could log in, see what is going on, and issues their orders all at a conveniently high level, with lots of analysis tools.


With retirement starting to at least be visible peeking over the horizon (half a dozen to ten years, almost down to single digits) I need some way to spend my time that does not involve using TV to cause brain damage.  Like most people who like to read, I have a fantasy that I will be able to write and will enjoy doing it.  I owe it to myself to find out.  One of my favorite genres is Science Fiction, and for really great SF a well-though-out world is essential.  In the old days, this would consist of binders of hand written notes.  What I would like to do in this century is leverage the same sort of tool (or exactly the same tool) as I use to run games as a master reference for an SF future history in which I can set my novels.


Huh?  Well, I am in the computer business; I write code for a living, and have for (brings up calculator) nigh 35 years this fall.  Its been a while since I tried to keep at the "bleeding edge" of development work, but I do depend on support work to pay the mortgage.

If we look at the systems written in the last five years, and project forward to the "legacy applications"  of tomorrow the key technology is Java in a web context using frameworks such as Struts or Spring.  Java Server Faces has solid purchase as a "front end" technology and Hibernate has a big chunk of the persistence space.  Note that I am not trying to predict the prime hot development environments of tomorrow; I need experience with technologies where, if I get solid skills now to complement my existing expertise, I will be well positioned to support those applications that corporations really have to keep going for at least another decade.  That's right, kids, Java is the COBOL of tomorrow.  Eventually it will be obsolete, but there is a solid mass of code its owners can't afford to turn off any time soon.

The obvious thing to do with the tool described above is to code and deploy it using a cross section of those technologies.  It will cost a bit and take some time, but since dev tools these days are free, and web hosting is just a couple of hundred a year, I should be able to put out a  muti-user product that will let me develop and showcase my skills while entertaining me, my friends, and hopefully my readers.

In future posts, I will get more into details.  Right now, I am prepping for a trip up the Danube, and I need to practice my useful Magyar phrases.

Saturday, April 27, 2013

XHIP counts

So, my census results for the XHIP catalog, for a 100 parsec wide cube centered on Sirius.

In the cube: 10,117 stars (plus sol, not in the file)
  • 811 of those are multiple star systems, but not variable stars
  • 3835 are listed is variable, but do not list components
  • 2101 are both variable and multiple - of course, that may include eclipsing binaries.
There are 50,760 stars more outside the cube, and additionally 57,078 with no distance information.

Monday, April 22, 2013

Firming up the sector model

First, just to note the XHIP computed heliocentric galactic-coordinates position for Sirius is (X,Y,Z) is -1.8,-1.9,-0.4.

Second, why am I so hung up on getting as wide a space as possible?  Fundamentally, because while ships move outwards in a linear way, stars pile up with the cube of the distance. with the rules we have discussed so far, and technologies where 10-20 milli-Gs (a few cm S^-2) is a lot of acceleration we are talking 3 months or so to transit from one average star to the next - a distance of a couple of parsecs.

So, if 100 parsecs is 50 jumps (optimistic, not a straight line) and for each jump we spend 3 months traversing the system to position for the next jump (reasonable in terms of time, but optimistic because we will have to "stop for gas") that's 150 months or more than 12 years to reach the edge of the map.  So I am worrying about nothing, it would seem.  Cut it in half - an 8th the number of stars - and you still have 6 years to the edge.  And a lot of those systems are going to be junk not worth visiting - we can probably store an adequate amount of information about it in a dozen bytes.

Another note about 100 parsecs - the galactic lens is only about 300 parsecs thick, so the 100 parsec radius would be 2/3rds the thickness of the galaxy!.

My next census will be a 50 parsec radius cube (so still 10^6  cubic parsecs) divided into sectors of radius 10 (so 125 cubes 20 parsecs across) centered on Sirius. 

Sunday, April 21, 2013

Stellar density and star catalogs

I've done a little analysis of XHIP that illustrates the issues with any current parallax catalog.

I took the stars with good distances in the new reduction, and determined which 5-parsec thick "shell" the fell into around sol.  While the number of stars in each shell is stable, of course the volume of each shell is greater, and the density drops off considerably.

This should not come as a surprise to anyone, of course.  Visual magnitude is a key factor in being able to make any observation about a star, and that (of course) drops off with square of distance.

With the current state of the art, that means that any catalog I produce which attempts to reproduce the "real" density of stars in local space will require a fair amount of guesswork.  Keep in mind, my objective is to produce something suitable for a game or for an SF story.  Story lines can be preserved by emphasizing the things we know, and playing down what what is made up.  For a game, the MST3K Mantra comes to mind.

How many stars do we need to backfill?  Well, taking the local density as "correct" - which is a poor base of itself, and will need refinement, we should have 361179 stars out to 105 parsecs, and only have distances for 24183.  This means that we will need to generate (or at least assign distances for) 336996 stars - overall, 93% or so of objects. 

Graphically by distance this looks like:

All of which will be invalidated by the ESA in about five years, but I'll survive.

A side note: if I want my space to be a 200 parsec  cube centered on Sirius, I should re-work this to 180 parsecs radius or so.

Saturday, April 20, 2013

Jump distance rules

I've been considering an interesting way to limit distance of jumps (for games and SF of course)  for a while  - something simple and not particularly concerned with units.

A simple approach would be to use  a relationship like:

Where M is the mass of the destination, and r is in parsecs.

Looking at a few examples, this would let us jump to Sirius (mass of primary+companion roughly 3 solar masses) over about 5.5 parsecs.  One could jump to our Sol  system from as far away as 3.2 parsecs.  To reach a A small M main sequence star like Gliese 185 one would need to be within 1.4 parses.

We are not all that close to any very massive stars; however this limit does have an interesting asymmetry; there will be many jumps that are possible in one direction buy not the other.

Saturday, April 13, 2013

Thoughts on ships, stars and science fiction

I've been trying to think of ways to combine maneuverability and fuel capacity in a single ship.  Here's a concept that might work.  It assumes that the only propulsion system that really does the job is some sort of very high exhaust velocity "atomic rocket".  Also, its clear that our travel system will need a lot of fuel; but size (length particularly) is opposed to maneuverability.  To be maneuverable, you want a sphere with no more in it than you need to fight.  To travel for long distances and maintain the craft you need a lot of impedimenta.

Here's my compromise:
The main propulsion is our probably-radioactive "rocket".  The combat module has the main engineering systems, significant fuel, main weapons, and enough habitat and life support for enough crew to fight the ship in minimal comfort.

The Fuel module is just what it says on the tin, and comes with rotating habitats, extended life support, and support and maintenance as well as reserve combat crew.  Landers, shuttles, expendable reloads, the "FTL space jump drive if it is a separate system"  and the like would also be part of that module.

The combat module would "park" the logistics module well away from the expected action, un-dock, and fight with in a low mass configuration with minimized moment of inertia.

I've also pulled down the All Sky catalog from Vizier; to save myself some time I decided to let the server calculate the galactic coordinates.

Now, if you can think back to reading SF in the 1970s, consider: in what decade you would have expected the above sentence to be written; perhaps what century?

Also in the realms of no longer SF, it looks like about three years from now I am going to need fibre-op and a couple of terrabytes of disk space. Cool or what?

Friday, April 12, 2013

Back to start?

The more I work with the HYG data (and star catalogs in general) the more I think I should go back to first principles with a new starting catalog, such as Kharchenko's All Sky Compiled Catalog.  It is a compilation from all the standard references, but also contains some useful additional record flags, and also goes down in some cases to 14th apparent magnitude, which will aid in the "backfill" objective. 

It does not have all the "name" cross references from HYG, but I an use the HIP catalog number to link that in.  It will doubtless have some errors, but all star catalogs have errors and I am coming to understand that (especially as you dive into the low magnitude objects) it is very difficult to say with absolute confidence that two catalogs that offer objects at a given location really are talking about the same star.

Also, since I will not be integrating catalogs myself, I can produce galactic references from a single self-consistent source. It's important to keep in mind that my ultimate objective is a fictional construct that it is as constant with the real world as I can make it, but with no regrets were it is not.  Science marches on, no matter what level of alignment with current knowledge I manage.

Something to keep in mind: If I want to go out to 100 parsecs (mostly at that extreme by photometric parallax) and get the faint end of main sequence dwarfs I need to get down to 20th apparent magnitude, which exceeds Kharchecko's low end.

Of course, using the estimate of .14 stars per parsec^3, that's over 586,000   systems.  Bit much to be useful; 50 parsecs would reduce to 73,000 stars, which is a lot more manageable. 

Sunday, April 7, 2013

More Hipparcos data

The Strasbourg Astronomical Data Centre has the Hipparcos multiple and variable data ; I'll get that merged into my own reference catalog, and see where that leads me.

Friday, April 5, 2013

Washington Double Star index included

I was surprised to find only 105 hits, but then these are visual binaries, which can very easily mean apparent binaries.  I'll try adding cross indexes on alternate catalogs, them move on to the MCS data.  A smaller dataset, but he fact that they are fully computed may make for a higher hit percentage.

Thursday, April 4, 2013

Some details on spin gravity side effects

On the spacefuture website

Thoughts on shuttle technology

Here's an interesting engine and spaceplane concept - the Skylon project from Britain. There are also some nice flight concept CGI clips floating around youtube.  The basic idea is to build velocity with an airbreathing engine before climbing to orbit with rockets; the engine itself is multi-mode.

While I am not particularly saying that this is "the" way shuttles will work in my future, it is interesting to see that research into viable reusable orbital technology is not dead.

Tuesday, April 2, 2013

And a curveball from reality

I've identified three good sources for multiple star data.  Two, importantly, have "HIP" cross references.
 The downside of these discoveries is that some of the entries I have already processed may be in those catalogs, with no notation of multiplicity in it's HYG. data line.  Such is life;  I will start the splitting by incorporating the cross reference in the names along with the critical annotations in the ongoing "rejects" data.  Incorporating these data into the re-run to exclude multiples from "simple solitary" processing will be simple enough, although I will end up with a lot more "leftovers" when I am done.

It is also time to start thinking about the file structure and database structure to capture the orbital data.

Almost 3/4 done...

but its the easy 3/4.  I have masses for 18,170  stars, leaving 6,688 catalog entries yet to solve; but I have also used up all the the simple, Astro 101 style, spectral types.  That's not the end of the world, but it does mean I need new techniques to process the single star entries, and also that it's time to start thinking about how to process the binary stars.  In fact, I probably need to look for a binary catalog to cross-index.

Monday, April 1, 2013

More progress

Well, the color index computations worked well, although I do have to extend my table for interpolation of B-V a bit.  I am doing a very simple linear interpolation; I could do something fancier with a polynomial but this is really good enough.

I do have to take a quick step back - I forgot to include a sequence number (for when we hit multiple star systems) and the computed spectral type in the final output sequence.  I'll get that in tonight I expect, the re-runs will be quick.

Saturday, March 30, 2013

The problem of radius

While I have data that I can set up in a simple table for Main Sequence stars, I do not have similar data for non-main-sequence.  Fortunately, much of our data has color index values.  That means I can use the formulas on  this site to compute radius directly.

Friday, March 29, 2013


In addition to the 7176 stars massed in the first pass, I looked at a few samples where the spectral type is present and the class is absent, and decided that, based on absolute magnitude, they are close enough to main sequence.  Running with that massed 9054 stars, leaving 8408 cases.  The next most common well-formed cases are class III.  Using the same sort of interpolation table will let me deal with almost 1000 more cases.

Lets look at mass

First, I have applied a simple filter to the data, I have drastically reduced the bulk of the HYG dataset; by restricting myself to a 100 parsec radius from sol I get down to 24,638 objects.  This is convenient since, among other things, it lets me load data into a spreadsheet to play with various formulas.

Now I can work through the first set of mass and luminosity calculations. The data has 7176 well-formed solitary spectral entries for class V (main sequence) starts.  I will use a table from Gillet's World Building to interpolate the mass and luminosity.  There are more sophisticated techniques I could apply based on bolometric correction and relationships between main sequence stars, but in terms of fitness for use this should be good enough.  If I get better mass data (say from exosolar planet search) I will backfit it.

Tuesday, March 26, 2013

Reminder to self

Update the file splitter to exclude (with a rejects file) stars with bad data per this post.

For my next trick...

The file splitter program now works. The next two steps will be
  • An analysis program to take one of the "rejects" file and work out what spectral types are most common.  This should help drive "bang for buck" planning on what to process next.
  • Many entries, I already know, are very clean spectral-type/number/class with no qualifiers.  I will start with those as the first to process.  In order to break up conversion of spectral data into mass and luminosity, I will start with a single class (probably class V since that's "us").

Sunday, March 24, 2013

Off to a start

I am making progress on splitting the data file.  I am using opencsv for my very lightweight file handling needs, and the latest eclipse as a platform.  Very utilitarian Java; so far I have produced a name file with 234,512 distinct names - most stars, o course, having multiple entries.

Friday, March 22, 2013

A file processing approach

You can tell I am old by the fact I am going to do the initial processing by files.  Remember, though, the objective is to produce a set of files that can be loaded into a relational database.

The first step (phase 0) will be to split the full Hyg2.0 file to extract the name information and location information for loading.  This will let us concentrate on a file of core fields that will describe the spectral information from which mass and luminosity will be derived along with simplified spectral types.

After that, the core fields file will be processed in a series of steps each of which will solve one processing problem and generate an output file with the stars solved in that step.  Stars that cannot be solved will be output to an ever-decreasing rejects file which will be input for the next phase.

Once no rejects are left, all the phase outputs will be loaded into the database.

Monday, March 18, 2013

More about fuel

Lets take the table from the last post, and work with the Tsiolkovsky equation to get some concrete numbers.  First we have to nail a couple of data points down.
  • Engine - we will assume an exaust velocity of 500,000 m/s.  This is very hot, but not entirely insane.
  • Mass -  We will assume we are delivering something of the mass of a Kirov battle-cruiser.  That's 28,000 tons, or near as matters 3x10^7 Kg.
So with using the table from the last post, and having confidence in our high school algebra:

Acceleration, G Days Fuel Mass, Tonnes Mass Ratio
2.00E+00 6.40E+00 7.65E+13 2.55E+09
1.50E+00 7.38E+00 4.20E+12 1.40E+08
1.00E+00 9.04E+00 1.34E+11 4.48E+06
1.00E-01 2.86E+01 3.78E+06 1.27E+02
1.00E-02 9.04E+01 1.09E+05 4.63E+00
1.00E-03 2.86E+02 1.87E+04 1.62E+00
1.00E-04 9.04E+02 4.97E+03 1.17E+00
1.00E-05 2.86E+03 1.49E+03 1.05E+00

Of course,what this really underscores is the un-realism of SF in which large warships plunge through solar systems at multi-G accelerations for days on end.I am not sure I know how important this is to me, though.

Sunday, March 17, 2013

Continuous Acceleration, Time and Fuel

This table sets aside looking at the mass of fuel directly, and just looking at how long it takes to go 10 AUs at various accelerations, the following table is just the Brachistochrone equation.

Distance, AU Acceleration, G Hours Days Delta-V, G-Hours
10 2.0E+00 153.484 6.39515 306.967333
10 1.5E+00 177.228 7.38449 265.8415085
10 1.0E+00 217.059 9.04411 217.0586827
10 1.0E-01 686.4 28.6 68.63998234
10 1.0E-02 2170.59 90.4411 21.70586827

Saturday, March 16, 2013

Systems and the "Hyg" database

There are two sorts of analysis it occurs to me that I have not done yet with these data.
  1. I have not attempted to group the individual records into "systems".  I can tell from the spectral data that many "star" entries are close multiples, but I have not looked to see how many have even identical X/Y/Z coordinates let along those indicating close grouping.  In terms of data reduction I think that is my first step.
  2. I have not looked how I will want to look at the reduced data when I am done, and to wold-build and record non-generated information on to of that.  No knowing a good chunk of  the endpoint is premature and I need to address it.

Thursday, March 7, 2013

Better processing software

Data reduction is what bogged me down last time.  How can I improve on it?  Here's my thinking right now.

Parts that work:
  • Java.  While I considered Perl for a while, I now know at least enough about regular expressions in Java to be dangerous.
  • Breaking the problem down, and solving the easy parts first.
Parts that I can improve:
  •  Premature data modelling.  I was focused on targeting a relational data model.  While there will ultimately be a relational database, concentrating on what I had specified for that model and generating load files for it was putting the cart before the horse.  Derive an object model first, tweak as needed, and then decide how the data should sit relationally.
  • Lack of a scorecard.  While I had processed the data in order of increasing difficulty, the capability to do so was not designed into the rather ad-hoc software.  The program needs a serious redesign to filter and attack the problem in a more systematic way.  Being able to measure progress is a side-effect of a good design in that respect.
If I re-attack the software design with those criteria in mind I should be able to sustain the effort to the end -- and to make something that is easier to return to after a break.

Wednesday, March 6, 2013

Getting re-inspired

I've just been re-reading Jack Campbell's The Lost Fleet: Beyond the Frontier: Dreadnaught.  A tad formulaic, but good fun and reminding me how much fun a science fiction world can be.

I've mused on this a couple of times recently, and there are two points to re-starting this project that I keep coming back to:
  • The amount of data to process is very large and for the values I need for my plans so far require extracting one of the most indirect values (mass) for every data point.  It finally occurred to me that I could us a proxy that we have for every object - luminosity.  OK, so there is no reason to believe that luminosity relationships would make it possible to travel faster than light; but there is no reason to think that gravity would either, it just feels more like it should.  Interestingly enough, luminosity is also the relationship Pournelle uses for his Alderson drive.  I don't think I run much risk of ending up with the same results however.  The important thing is that it will get me to having "jump maps" faster.
  • As far as data reduction goes, I waffle back and forth on JAVA vs a scripting language such as Perl.  I will stick with Java for now, but perhaps more on this later.
So, consider this project re-started.