JustOne: When a Pivot Is Not Possible

February 4, 2017

CopperEye hit my radar when I did a project for the now-forgotten Speed of Mind search system. CopperEye delivered high speed search in a patented hierarchical data management system. The company snagged some In-Q-Tel interest in 2007, but by 2009, I lost track of the company. Several of the CopperEye senior managers teamed to create the JustOne database, search and analytic system. One of the new company’s inventions is documented in “Apparatus, Systems, and Methods for Data Storage and/or Retrieval Based on a Database Model-agnostic, Schema-Agnostic, and Workload-Agnostic Data Storage and Access Models.” If you are into patent documents about making sense of Big Data, you will find US20140317115 interesting. I will leave it to you to determine if there is any overlap between this system and method and those of the now low profile CopperEye.

Why would In-Q-Tel get interested in another database? From my point of view, CopperEye was interesting because:

  1. The system and method was ideal for finding information from large collections of intercept information
  2. The tech whiz behind the JustOne system wanted to avoid “band-aid” architectures; that is, software shims, wrappers, and workarounds that other data management and information access systems generated like rabbits
  3. The method of finding information achieved or exceeded the performance of the very, very snappy Speed of Mind system
  4. The system sidestepped a number of the problems which plague Oracle-style databases trying to deal with floods of real time information from telecommunication traffic, surveillance, and Internet of Things transmissions or “emissions.”

How important is JustOne? I think the company is one of those outfits which has a better mousetrap. Unlike the champions of XML, JustOne uses JSON and other “open” technologies. In fact, a useful version of the JustOne system is available for download from the JustOne Web site. Be aware that the name “JustOne” is in use by other vendors.

The fragmented world of database and information access. Source: Duncan Pauly

A good, but older, write up explains some of the strengths of the JustOne approach to search and retrieval couched in the lingo of the database world. The key points from “The Evolution of Data Management” strike me as helpful in understanding why Jerry Yang and Scott McNealy invested in the CopperEye veterans’ start up. I highlighted these points:

  • Databases have to be operational and analytical; that is, storing information is not enough
  • Transaction rates are high; that is, real time flows from telecommunications activity
  • Transaction size varies from the very small to hefty; that is, the opposite of the old school records associated with old school IBM IMS system
  • High concurrency; that is, more than one “thing” at a time
  • Dynamic schema and query definition

I highlighted this statement as suggestive:

In scaled-out environments, transactions need to be able to choose what guarantees they require – rather than enforcing or relaxing ACID constraints across a whole database. Each transaction should be able to decide how synchronous, atomic or durable it needs to be and how it must interact with other transactions. For example, must a transaction be applied in chronological order or can it be allowed out of time order with other transactions providing the cumulative result remains the same? Not all transactions need be rigorously ACID and likewise not all transactions can afford to be non-atomic or potentially inconsistent.
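The out-of-order question in that passage can be made concrete with a small sketch. This is not JustOne's implementation; the `Txn` class and the `apply_all` and `order_independent` functions are hypothetical names invented for this illustration. The sketch assumes two kinds of update: an increment, which commutes, and an overwrite, which does not. It checks whether every ordering of a batch yields the same cumulative result.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction: 'add' is commutative, 'set' is order-sensitive."""
    kind: str   # "add" or "set"
    value: int


def apply_all(txns, start=0):
    """Apply transactions in the given order and return the final balance."""
    balance = start
    for t in txns:
        if t.kind == "add":
            balance += t.value
        elif t.kind == "set":
            balance = t.value
        else:
            raise ValueError(f"unknown transaction kind: {t.kind}")
    return balance


def order_independent(txns, start=0):
    """Return True if every ordering of txns yields the same cumulative result."""
    results = {apply_all(p, start) for p in permutations(txns)}
    return len(results) == 1


adds = [Txn("add", 5), Txn("add", -2), Txn("add", 10)]
mixed = [Txn("add", 5), Txn("set", 100)]

print(order_independent(adds))   # increments commute, so order does not matter
print(order_independent(mixed))  # an overwrite does not commute with an increment
```

A batch like `adds` could in principle be applied out of time order, while a batch like `mixed` would need the stricter, chronological guarantee. The brute-force permutation check is only practical for tiny batches; it is here to show the idea, not to suggest how a real database would decide.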

My take on this CopperEye wind down and JustOne wind up is that CopperEye, for whatever management reason, was not able to pivot from where CopperEye was to where CopperEye had to be to grow. More information is available from the JustOne Database Web site at www.justonedb.com.

Is Duncan Pauly one of the most innovative engineers laboring in the database search sector? Could be.

Stephen E Arnold, February 4, 2017

Free One Year Old, 13 Page IBM Ebook about Cognitive Computing

February 3, 2017

I like short books. But 13 pages? If you are thirsty for knowledge about IBM’s cognitive computing push, you will want to navigate to this link and download The Promise of Cognitive Computing, originally published in February 2016. Timely. My undergraduate honors essay was about five times longer than this IBM ebook.

What’s in the scholarly gem? Here’s a sampling of the topics:

  • Technology transforms
  • The opportunity is providing insights
  • The solution: Businesses built on cognitive computing
  • The opportunity for start ups
  • Six real life examples of cognitive computing products
  • Six steps for developing a cognitive computing product

There are two sidebars filled with useful information; for example, a definition of unstructured information and four reasons to build a cognitive computing business using Watson. There is also a link to a 30 day free trial of Watson.

Interesting. What’s happened in a year of cognitive computing? Not enough to warrant a second edition. Apparently the 19 consecutive quarters of declining revenue have blunted some of the marketing enthusiasm for ebooks.

Stephen E Arnold, February 3, 2017

IQwest IT Steps Up Its Machine Translation Marketing

February 3, 2017

Machine translation means that a computer converts one language into another. The idea is that the translation is accurate; that is, presents the speaker’s or writer’s message payload without distortion, odd ball syntax, and unintended humor. What’s a “nus”? The name of a nuclear consulting company or a social mistake? Machine translation, as an idea, has been around since that French whiz Descartes allegedly cooked up the idea in the 17th century.

I read two almost identical articles, which triggered my content marketing radar. The first write up appeared in KV Empty Pages as “Finding the Needle in the Digital Multilingual Haystack.” The second article appeared in the Medium online publication as “Finding the Needle in the Digital Multilingual Haystack.”

Notice the similarity. Intrigued, I ran a query for IQwest. I noted that the domain name IQwest.com refers to a bum domain name. I did a bit of poking around and learned that there are companies using IQwest for engineering services, education, and legal technologies. The IQwest.com domain is owned by Qwest Communications in Denver.

The machine translation write up belongs to the IQwestIT.com group. No big deal, of course, but knowing which company’s name overlaps with other companies’ usage is interesting.

Now what’s the message in these two identical essays beyond content marketing? For me, the main point is that a law firm can use software translation to eliminate documents irrelevant to the legal matter at hand. For documents not in the lawyer’s native language, machine translation can churn out a good enough translation. The value of machine translation is that it is cheaper than a human translator and a heck of a lot less expensive.

Okay, I understand, but I have understood the value of machine translation since I had access to a Systran based system years ago. Furthermore, machine translation systems have been an area of interest in some of the government agencies with which I am familiar for decades.

The write up states:

building a model and process that takes advantage of benefits of various technologies, while minimizing the disadvantages of them would be crucial. In order to enhance any and all of these solution’s capabilities, it is important to understand that machines and machine learning by itself cannot be the only mechanism we build our processes on. This is where human translations come into the picture. If there was some way to utilize the natural ability of human translators to analyze content and build out a foundation for our solutions, would we be able improve on the resulting translations? The answer is a resounding yes!

Another okay from me. The solution, which I anticipated, is a rah rah for the IQwest machine translation system. What’s notable is that the number of buzzwords used to explain the system caught my attention; for instance:

  • Classification
  • Clustering
  • N grams
  • Summarization

These standard indexing functions are part of the IQwest machine translation system. That system, the write up notes, can be supplemented with humans who ride herd on the outputs and who interact with the system to make sure that entities (people, places, things, events, etc.) are identified and translated. This is a slippery fish because some persons of interest have different names, handles, nicknames, code words, and legends. Informed humans might be able to spot these entities because no system with which I am familiar is able to knit together some well crafted aliases. Remember those $5,000 teddy bears on eBay. What did they represent?
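Of the buzzwords listed above, N grams are the easiest to illustrate. The sketch below is a generic example, not anything taken from the IQwest system; the function name `word_ngrams` and the sample sentence are made up for illustration. It splits text on whitespace and returns overlapping runs of n words, which is the raw material many classification and clustering routines count.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Return the word n-grams (as tuples) of a lowercased, whitespace-split text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sample = "machine translation is cheaper than human translation"

bigrams = word_ngrams(sample, 2)          # overlapping two-word sequences
counts = Counter(word_ngrams(sample, 1))  # single-word frequencies

print(bigrams[:3])
print(counts[("translation",)])
```

Real systems normally add tokenization that handles punctuation and, for multilingual material, language-specific word segmentation. Whitespace splitting is a simplifying assumption here.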

The write up seems to be aimed at attorneys. I suppose that group of professionals may not be aware of the machine translation systems available online and for on premises installation. For the non attorney reader, the write up tills some familiar ground.

I understand the need to whip up sales leads, but the systems available from Google and Microsoft, to name just two, work reasonably well. When those systems are not suitable, one can turn to SDL or Systran, to name two vendors with workable systems.

Net net: My thought is that two identical versions of the same article directed at a legal audience represent a bit of marketing wonkiness. The write up’s shotgun approach to reaching attorneys is interesting. I noticed the duplication of content, and my hunch is that Google’s duplicate detection system did as well.

Perhaps placing the write up in an online publication reaching lawyers would be a helpful use of the information?  What’s clear is that IQwest represents an opportunity for some motivated marketing expert to offer his or her services to the company.

My take is that IQwest offers a business process for reducing costs for litigation related document processing. The translation emphasis is okay, but the idea of making a phone call and getting the job done is what differentiates IQwest from, for example, the GOOG. I remember Rocket Docket. A winner. When I looked at that “package,” the attorneys with whom I spoke did not care about what was under the hood. The hook was speed, reduced cost, and more time to do less dog work.

But the lawyers may need to hurry. “Lawyers Are Being Replaced by Machines That Read.” Dragging one’s feet technologically and demanding high salaries despite a glut of legal eagles may change the game and quickly.

Plus, keep in mind FreeTranslations.org. You can get voice translations as well as text translations. The increasingly frugal Google has trimmed its online translation service. Sigh. The days of pasting lengthy text into a box are gone like a Loon balloon drifting away from Sri Lanka.

There are options, gentle reader.

Stephen E Arnold, February 3, 2017

Little New Hampshire Public Library Takes on Homeland Security over Right to Tor

February 3, 2017

The article on AP titled Browse Free or Die? New Hampshire Library Is at Privacy Fore relates the ongoing battle between The Kilton Public Library of Lebanon, New Hampshire and Homeland Security. This fierce little library was the first in the nation to use Tor, the location and identity scrambling software with a seriously bad rap. It is true, Tor can be used by criminals, and has been used by terrorists. As this battle unfolds in the USA, France is also scrutinizing Tor. But for librarians, the case is simple:

Tor can protect shoppers, victims of domestic violence, whistleblowers, dissidents, undercover agents — and criminals — alike. A recent routine internet search using Tor on one of Kilton’s computers was routed through Ukraine, Germany and the Netherlands. “Libraries are bastions of freedom,” said Shari Steele, executive director of the Tor Project, a nonprofit started in 2004 to promote the use of Tor worldwide. “They are a great natural ally.”… “Kilton’s really committed as a library to the values of intellectual privacy.”

To illustrate a history of action by libraries on behalf of patron privacy, the article briefly lists events surrounding the Cold War, the Patriot Act, and the Edward Snowden leak. It is difficult to argue with librarians. For many of us, they were amongst the first authority figures, they are extremely well read, and they are clearly arguing passionately about an issue that few people fully understand. One of the library patrons spoke about how he is comforted by the ability to use Tor for innocent research that might get him flagged by the NSA all the same. Libraries might become the haven of democracy in what has increasingly become a state of constant surveillance. One argument might go along these lines: if we let Homeland Security take over the Internet and give up intellectual freedom, don’t the terrorists win anyway?

Chelsea Kerwin, February 3, 2017

Give a Problem, Take a Problem

February 3, 2017

An article at the Telegraph, “Employees Are Faster and More Creative When Solving Other People’s Problems,” suggests innovative ways to coax creative solutions from workers. Writer Daniel H. Pink describes three experiments, performed by New York University’s Evan Polman and Cornell’s Kyle Emich. The researchers found that, when posed with hypothetical scenarios, participants devised more creative solutions when problems were framed as being someone else’s. But why? Pink writes:

Polman and Emich build upon existing psychological research showing that when we think of situations or individuals that are distant – in space, time, or social connection – we think of them in the abstract. But when those things are close – near us physically, about to happen, or standing beside us – we think about them concretely. Over the years, social scientists have found that abstract thinking leads to greater creativity. That means that if we care about innovation we need to be more abstract and therefore more distant. But in our businesses and our lives, we often do the opposite. We intensify our focus rather than widen our view. We draw closer rather than step back. That’s a mistake, Polman and Emich suggest. ‘That decisions for others are more creative than decisions for the self… should prove of considerable interest to negotiators, managers, product designers, marketers and advertisers, among many others,’ they write.

The article goes on to supply five practical suggestions this research has for business. For one, organizations can recruit independent directors to bring in more objective points of view. Pink also suggests keeping firms loosely structured, and bringing together peers from different fields to exchange ideas. On the individual level, he advises finding a “problem-swapping partner” with whom you can trade perspectives. Finally, workers can create psychological distance between themselves and their projects by imagining they’re helping out someone else.

Pink acknowledges a couple of caveats to this approach. For one, many tasks actually do require concrete thinking and laser focus; it is important to recognize them. Also, the business world is not currently structured to take advantage of this quirk of the human psyche. The article points to the growth of crowd-sourcing techniques as evidence that factor may change. Perhaps… but group think brings its own issues, like the potential for discounting experience and specialized skill sets, for example. To whom shall we turn for a fresh perspective on that problem?

Cynthia Murrell, February 3, 2017

Smart Software Recipe Fiesta

February 2, 2017

I read “140 Machine Learning Formulas.” The listing hits the top 10 most popular algorithms and adds an additional 130. The summary of the formulas is at this link. A happy quack to Rubens Zimbres who compiled the list. A profile of Mr. Zimbres is available at this link. FYI. He’s looking for a new challenge.

Stephen E Arnold, February 2, 2017

Bradley Metrock and the Alexa Conference: Alexa As a Game Changer for Search and Publishing

February 2, 2017

Bradley Metrock, Score Publishing, organized The Alexa Conference held in January 2017. More than 60 attendees shared technical and business insights about Amazon’s voice-search enabled device. The conference recognized the opportunity Amazon’s innovative product represents. Keyword search traditionally has been dependent on a keyboard. Alexa changes the nature of information access. An Alexa owner can talk to a device which is about the size of a can of vegetables. Alexa is poised to nudge the world of information access and applications in new directions.

Bradley Metrock, Score Publishing, organized The Alexa Conference in January 2017. An expanded event is in the works.

After hearing a positive review of the conference, its speakers, and the programming event, I spoke with Mr. Metrock. The full text of the interview appears below:

Thanks for taking the time to speak with me.

Delighted to do it.

What path did you follow to arrive at The Alexa Conference?

A somewhat surprising one. My background is in business, but I’ve always been keenly interested in publishing.  It’s fascinating how the world of publishing has been ripped open by technology, allowing us as a society to shed gatekeepers and hear more stories from more people than we ever would have otherwise. In 2013, when I was in the process of selling a business, I discovered Apple’s iBooks Author software.  I couldn’t understand why more people weren’t talking about it.  It was such a gift: the ability to create next-generation, interactive and multimedia digital books that could be sold on Apple hardware (iPads at first, then later iPhones) all for no cost.  The software was completely free. I formed Score Publishing, published books using iBooks Author, and organized the annual iBooks Author Conference which all sorts of people attend from all over the world.  It’s been fun.

Where does Alexa fit into your interest in publishing books?

I approached Alexa at first from the standpoint of digital content creators: What do they need to get out of this tool?  And out of the Internet of Things, in general?

Do you have an answer to this question about using Alexa as an authoring tool?

No, not yet. My long-term ambition with Alexa is to produce authoring tools for it that allow content creators to leverage their content effectively in an audio-only environment.  Not just audio books, but the creation of voice-enabled applications around published works, from books to white papers and so forth.

What is needed to make it easy for an author or developer to leverage Amazon’s remarkable device and ecosystem?

That’s a good question. The first step toward doing that is learning Alexa myself and incorporating it into what Score Publishing already does.  To that end, we decided to put on the first-ever Alexa Conference. We experienced directly the incredible value in bringing communities of people together on the iBooks Author side of things.  We saw the same exact things with the just-completed Alexa Conference and can’t wait to do it again next January. In fact, we’re already planning it.

What were some of the takeaways for you from The Alexa Conference?

I think Amazon has opened an entirely new world with Alexa that perhaps even they didn’t fully appreciate at first.  Alexa puts voice search in the home. But far from just new ways to buy products or services, Alexa allows every computing interface that exists today to be re-imagined with greater efficiency, while also creating greater accessibility to content than ever before.  My eyes were opened in a big way.

Can you give me an example?

I can try, but it’s hard for me to even begin to explain, being relatively new to the technology and the ideas that Alexa (and IoT in general) bring to the table, but a good place to start is the summary from the first Alexa Conference.  This report gives a taste of the topics and ideas covered.

One of the most interesting events at The Alexa Conference was the programming of an Alexa skill. You called it the Alexathon, right?

Yes, and it was fascinating to watch the participants at work and then experience what they created in less than 24 hours. Developers are red-hot for this technology and are eager to explore its full potential.  They understand these are the early days, just like it was a decade ago with iOS apps for the iPad and iPhone. They see, in my opinion, a combination of opportunity and necessity in being part of it all.

What was the winning Alexa skill?

The winner was Xander Morrison, the Digital Community Coordinator at Sony Music’s Provident Label Group. It took Morrison just 24 hours to create his Nashville Tour Guide as an Alexa skill.

How does Alexa intersect with publishing?

I think the publishing industry doesn’t really understand the implications of the internet of things on its business. Companies like HarperCollins, whom I invited to be part of The Alexa Conference, sent Jolene Barto to the conference. She described how her company built an Alexa skill for one of the company’s most important markets. Her remarks sparked a lively question-and-answer session. HarperCollins seems to be one of the more proactive publishers in the Alexa space at this time.

Is it game over for Google and the other companies offering Alexa-type products and services?

No. I think it is the dawn of the voice enabled application era. Right now, it looks as if Alexa has a clear lead. But the Internet of Things is a very dynamic technology trend. The winner will probably be the company which creates tools.

What do you mean tools?

Software and systems that make it easy for digital content to flow into it and be re-purposed in new and exciting ways.

Is this an opportunity for you and Score Publishing?

Yes. As I mentioned earlier, this is an area I want Score Publishing involved in. We may create some of the tools to help bridge the gap for content creators. Many authors and publishers have no interest in learning how to code. Alexa and the competing products do not make it easy for authors and publishers to get their content into the ecosystem all the same.

Google has a competing product and recently updated it. What’s your view of Google with regard to Alexa?

Google is definitely in the fray with Apple Siri and Microsoft Cortana. Also, there are several other less well known competitors. Amazon’s primary advantage is how early Amazon opened up Alexa to third-party development.  Alexa’s other advantages include the sheer marketing reach of Amazon. I learned at the conference that Amazon has done a great job in promoting its hardware, from the Echo to the Tap and Dot. Now the Amazon Kindle has Alexa baked into the device. Amazon has, in contrast to Apple and Google, demonstrated its willingness to spend significant dollars to advertise both Alexa and Alexa-enabled hardware.

However, Google has something Amazon doesn’t–search data.  And Apple has the dominant mobile device.  So there are advantages these other companies can bring to bear in competing in this space.  I want to point out that Amazon has its shopping data, and its Alexa team will find ways to leverage its consumer behavior data as Alexa evolves over time.

What are your ideas for The Alexa Conference 2018?

Yes. We will be having another The Alexa Conference in January 2018. The event will be held in Nashville, Tennessee. We want to expand the program. We hope to feature topic and industry-specific sub-tracks as well. If your readers want to sign up, we have Super Early Bird passes available now. There is a limited supply of these. We expect to announce more information in the next month or so.

How can a person interested in The Alexa Conference and Score Publishing contact you?

We have a number of de-centralized websites such as the iBooks Author Conference, the iBooks Author Universe (a free online learning resource for iBooks Author digital publishing) and now, the Alexa Conference.  Following us on Twitter at @iBAConference and @AlexaConf is a great idea to stay in the know on either technology, and to reach me, people can email me directly at Bradley@AlexaConference.com.

Thank you for taking the time to speak with me.

Stephen E. Arnold, February 2, 2017

Penn State Research Team Uses Big Data to Explore Crime Rates

February 2, 2017

The article on E&T titled Social Media and Taxi Data Improve Crime Pattern Picture delves into a fascinating study that uses big data involving taxi routes and social media location labels from sites like Foursquare to discover a correlation between taxis, locations of interest, and crime. The study was executed by Penn State researchers who are looking for a more useful way to estimate crime rates rather than the traditional approach targeting demographics and geographic data only. The article explains,

The researchers say that the analysis of crime statistics that encompass population, poverty, disadvantage index and ethnic diversity can provide more accurate estimates of crime rates … the team’s approach likens taxi routes to internet hyperlinks, connecting different communities with each other… One surprising discovery is that the data suggests areas with nightclubs tend to experience lower crime rates – at least in Chicago.  The explanation may be that it reflects people’s choices to be there.

This research will be especially useful to city planners interested in how certain spaces are being used, and whether people want to go to those spaces. But the researcher Jessie Li, an assistant professor of information sciences, explained that while the correlation is clear, the underlying cause is not yet known.

Chelsea Kerwin, February 2, 2017

Tips for Better Search Results

February 2, 2017

Want to be an expert searcher? Gizbot shares some tips, complete with screenshots, in their brief write-up, “Here are 5 Tricks To Get Better Google Search Results.” Writer Sneha Saha begins:

To get any information about anything is easy. Just type the keywords on the Google Search engine and you are done. Rather you might just get information that is far more than what you would actually need. However, getting more information than you require is also a little annoying. Searching for the accurate information among the numerous links that Google provides you with is surely a tough task. We at GizBot have come up with a list of effective methods to try out to search the most accurate information on Google in just a few clicks.

Here are the five tricks: Search for synonyms using a tilde symbol; Use an asterisk in place of any word you cannot remember; Include “or” when confused between two options; Use “intitle” to find keywords within a title or “inurl” to find keywords within a URL; Narrow results by including a date range in your query. See the post for details on any of these search tips.
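For readers who build queries programmatically, the tricks above can be combined in a small helper. This is a rough sketch, not anything from the Gizbot post: the function `build_query` and its parameter names are hypothetical, and the `start..end` numeric range is an assumption about how the date range trick is written, since the summary above does not spell out the syntax. The tilde is left out because Google's support for it may have changed.

```python
def build_query(terms, any_of=None, in_title=None, in_url=None, year_range=None):
    """Assemble a search string from plain terms plus optional operators.

    terms      -- list of words; use "*" as a stand-in for a forgotten word
    any_of     -- alternatives joined with OR
    in_title   -- keyword that must appear in the page title (intitle:)
    in_url     -- keyword that must appear in the URL (inurl:)
    year_range -- (start, end) tuple rendered as start..end (assumed syntax)
    """
    parts = list(terms)
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")
    if in_title:
        parts.append(f"intitle:{in_title}")
    if in_url:
        parts.append(f"inurl:{in_url}")
    if year_range:
        start, end = year_range
        parts.append(f"{start}..{end}")
    return " ".join(parts)


query = build_query(
    ["enterprise", "*", "search"],
    any_of=["Solr", "Elasticsearch"],
    in_title="review",
    year_range=(2015, 2017),
)
print(query)
```

The helper only concatenates strings. Whether a given operator still behaves as described depends on the search engine, so check the results before relying on them.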

Cynthia Murrell, February 2, 2017

Google Semantics Sort of Explained by an SEO Expert

February 1, 2017

I know that figuring out how Google’s relevance ranking works is tough. But why not simplify the entire 15 year ball of wax for those without a grasp of Messrs. Brin and Page, their systems and methods, and the wrapper software glued on the core engine? Keep in mind that it is expensive and time consuming to straighten a bent frame when one’s automobile experiences a solid T bone impact. Google’s technology foundation is that frame, and over the years, it has had some blows, but the old girl keeps on delivering advertising revenue.

I read “Semantic Search for Rookies. How Does Google Search Work.” The write up does not provide the obvious answer; to wit:

Well enough for the company to continue to show revenue growth and profits.

The write up takes a different tack toward the winds of relevance. I highlighted this passage:

Google’s semantic algorithm hasn’t developed overnight. It’s a product of continuous work:

  • Knowledge Graph (2012)
  • Hummingbird (2013)
  • RankBrain (2015)
  • Related Questions and Rich Answers (ongoing)

The work began many years before 2012, but that is of no consequence to the SEO whiz explaining how Google search works.

The write up then brings up the idea of semantic and relevance obstacles. I won’t drag in issues such as disambiguation, a user’s search history, and Google’s method of dealing with repetitive queries. I won’t comment on Ramanathan Guha’s inventions nor bring up the work in semantics which began when Jeff Dean revealed how many versions of Britney Spears’ name were in one of Google’s suggested search subsystems.

The way to take advantage of where Google is today boils down to writing an article, a blog post similar to this one you are reading, or any textual information and employing user oriented phrasing and algorithm oriented phrasing. The explanation of these two types of phrasing was too sophisticated for me. I urge you, gentle reader, to consult the source document and learn yourself by sipping from the font of knowledge. (I would have used the phrase “Pierian spring” but that would have forced me to decide whether I was using a bound phrase, semantic oriented phrase, or algorithm oriented phrase. That’s too much work for me.)

The write up concludes with these injunctions:

If you wish to create well-optimized content, you shouldn’t focus on text in the traditional sense. Instead, you should focus on words and word formation which Google expects to see. In this day and age, users’ feedback plays a crucial role in determining the importance of content. You will have to cater to both sides. Create content with lots of synonyms and semantically related words incorporated in it. Try to be provocative and readable at the same time.

I don’t want to rain on the SEO poobah’s parade, but there are some issues that this semantic write up does not address; namely, the challenge of rich media. How does one get one’s video indexed in a correct way in YouTube.com, GoogleVideo.com, Vimeo.com, or one of the other video search systems? What about podcasts, still images, Twitter outputs, public Facebook goodies, and social media image sharing sites?

My point is that defining semantics in terms of a particular content type suggests that Google has a limited repertoire of indexing, metatagging, and cross linking methods. Perhaps a quick look at Dr. Guha’s semantic server would shed some light on the topic? Well, maybe not. This is, after all, SEO oriented with semantic and algorithmic phrasing I suppose.

Stephen E Arnold, February 1, 2017
