Saturday, February 22, 2014

Begin it where...

About a month ago, I started field testing a set of text mining algorithms that combine map nodes with a library of English-language synonyms.  I won't go into the original purpose of this work, but field testing has led me to "the poem".  I thought: why not see if I can solve what as many as 30,000 others have not?

I am now more than just intrigued by this funny little array of carefully crafted words.

The challenge is to find "context" within such a set of words, decipher the context with synonym possibilities, and assign probabilities as they relate to proximity cluster sets of map nodes within a distance tolerance.  In this case, the author suggests there is also a sequential context.  To that end, the results must also be interpreted for sequential context, for example with a TSP (traveling salesman problem) algorithm.
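
A minimal Python sketch of this idea, not the actual GATE/Mathematica/Groovy pipeline, might look like the following.  The synonym table, the node list, and the coordinates are all made up for illustration, and the names `candidates` and `ordered_chains` are hypothetical.  It also leaves out the probability weighting and simply enumerates ordered chains by brute force rather than running a real TSP solver.

```python
import math
from itertools import product

# Hypothetical synonym table; a real run would draw on thesaurus software.
SYNONYMS = {
    "warm": {"warm", "hot", "thermal"},
    "halt": {"halt", "stop", "station", "end"},
    "brown": {"brown", "hazel"},
}

# Hypothetical map nodes as (name, latitude, longitude). Coordinates are invented.
NODES = [
    ("Warm Springs Station", 42.0, -108.0),
    ("Hazelton", 42.1, -108.1),
    ("Hot Creek", 45.0, -111.0),
    ("Brown Hill", 39.0, -105.0),
]


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    radius = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def candidates(clue_words):
    """Nodes whose name contains any synonym of any clue word (naive substring match)."""
    terms = set()
    for word in clue_words:
        terms |= SYNONYMS.get(word, {word})
    return [n for n in NODES if any(t in n[0].lower() for t in terms)]


def ordered_chains(clues, tolerance_miles):
    """One node per clue, in clue order, with every hop inside the distance tolerance."""
    pools = [candidates(c) for c in clues]
    chains = []
    for chain in product(*pools):
        hops = zip(chain, chain[1:])
        if all(
            a is not b and haversine_miles(a[1], a[2], b[1], b[2]) <= tolerance_miles
            for a, b in hops
        ):
            chains.append([n[0] for n in chain])
    return chains


clues = [["warm", "halt"], ["brown"]]
print(ordered_chains(clues, tolerance_miles=25))
```

The substring match is deliberately crude (a term like "end" would also hit "Bend"), which is one reason a real pass needs probabilities rather than yes/no matches.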

To start, I scanned his two books and converted them to text.  I used thesaurus software for synonym clusters.  I used GATE and Mathematica (I also dabbled in Octave for specific fuzzy relational operations), with Groovy as my framework.  Interpreting unstructured text and inferring context is a field I have been working in for a few years, at first indirectly, and more recently directly.

I then tagged the poem with location contexts and began comparing these with "clues" suggested to be sprinkled across the stories.  Predictable patterns emerged, including specific focus on certain keyword relationships from the poem like "halt", "the end", "nigh", "cease", and the mysterious Omega symbols from the books (colophon).  I observed heavy undertones of Turkish fatalism, which is not surprising from a man who was facing the inevitable after witnessing what disease had done to his beloved father.

I've been tempted to enrich my results by crawling blogs where curious treasure hunters have cleverly opened up comments to aggregate ideas.  As I started down this path, I realized that introducing so much noise would lead to entropy.  But, a word to the wise:  search terms and general geo-references end up on blog stats.  If you came here through a search engine, you're tipping me to your ideas.  It's no different for the other longer-lasting bloggers who provide such helpful tips and insights (no offense, I get "clever").

The gentleman who has challenged me has architected a nasty little set of attractive red herrings which has sent many families into a thoughtless pattern of searching the wilderness inside 382,894 square miles.  Thankfully he decided to narrow the search a bit more by restricting us to the USA, and then to Montana, Wyoming, Colorado, and New Mexico.  And even more, to a range of elevation: above 5,000 feet and below 10,200 feet.

Using elevation and boundary attributes with the map nodes helped to limit certain cluster prospects, but I needed to be cautious with this approach because "the end" node is in the target range, not necessarily the beginning.  After reviewing these results, it began to appear that no sequential pattern was emerging.  The clever gentleman appears to be using more visual cues than those recorded in well-known mapping software.  I now fall into his classification of armchair hunter.
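
The caution about "the end" node can be sketched roughly as follows, again in Python and again with invented data.  The `Node` class, the field names, and `keep_chain` are hypothetical; the state list and the elevation limits are the ones stated above.  The point is that only the last node in a chain is held to the constraints, so earlier nodes can sit outside the range.

```python
from dataclasses import dataclass

SEARCH_STATES = {"MT", "WY", "CO", "NM"}
MIN_FT, MAX_FT = 5000, 10200  # above 5,000 feet and below 10,200 feet


@dataclass(frozen=True)
class Node:
    name: str            # hypothetical node attributes
    state: str
    elevation_ft: float


def in_target_range(node):
    """True if the node is in a search state and strictly inside the elevation band."""
    return node.state in SEARCH_STATES and MIN_FT < node.elevation_ft < MAX_FT


def keep_chain(chain):
    """Keep a chain if its final node is in range; earlier nodes are not filtered."""
    return bool(chain) and in_target_range(chain[-1])


# Invented elevations, for illustration only.
start_low = Node("Low Start", "WY", 4200)
end_ok = Node("High End", "WY", 7800)
end_high = Node("Summit End", "CO", 11500)

print(keep_chain([start_low, end_ok]))    # start is below the band, end is inside it
print(keep_chain([start_low, end_high]))  # end is above the band
```

Filtering every node instead of only the last one would have thrown away any chain that begins below 5,000 feet, which is exactly the mistake to avoid here.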

Here are several of my algorithm results which turned up some interesting clusters.  I do this for no other reason than to describe the results of my math.  I will keep playing but I have found separate reward in locating targets from synonym-encrypted sources.

... where warm waters halt
- Warm Springs Station, WY, Pony Express
- Warm Springs Cliff, WY
- the end of Luke Rd, at Yellowstone Lake

... canyon down
- Sinks Canyon

... too far to walk
- Sevenmile Hole trail
- Ninemile trail

... the home of Brown
- Hazelton, WY
- Brown's Pass, Hartsel, CO
- Brown's Park, WY
- Browns Mountain, WY
- Browns Canyon near Sinks, south of Lander

... no place for the meek
- Wolf Trail

... the end is ever drawing nigh
- Sand Draw Creek
- Sand Draw Rd
- Dinosaur Hill

... no paddle
- Skull this or that (scull)

... brave and in the wood
- Sherwood Point, WY

... worth the cold
- Cold Spring

Other clues:

Children have an advantage.
- Childs Creek, WY
- Disneyland Rd, near Teacher's Rd in Pavillion, WY  - what the hell is this place except an EPA risk?

I have a place in mind that children would enjoy, which I'm going to check; finding it required additional non-English synonyms.  Most of the clues can't be found on the map nodes I have.  Not sure if the gentleman knew this when preparing his clues, but if he did... extremely clever, and it would speak to why no one has found it yet.
