Mold Hits the Internet
October 29, 2013
Links rot quickly on the Internet. How many times have you visited a Web page and clicked on something interesting, only to be met with the standard “page does not exist” template? It is happening more often as the Internet gains prominence in people’s lives, and researchers deal with moldy links more than anyone would have thought possible. This poses a problem, as Gigaom discusses in the article, “A Web Page Lasts Forever: The Plan To Stop ‘Link Rot’ In Law And Science.” Link rot turns up everywhere, from the Web pages a teenager makes to those of the United States Supreme Court.
The problem has not been taken as seriously as it should be, because dead-tree materials are still considered the authoritative source: they are static. Web sites, though, can be changed in the blink of an eye.
There is a plan to stop the rot before it eats away the entire Web. Times Higher Education took a look at Perma CC, a resource for lawyers that takes a Web page and creates an etched-in-stone digital reference.
“It works like this: a scholar (or anyone else) can submit a link to Perma CC, which is managed by a coalition that includes universities, libraries and the Internet Archive. According to Perma CC, the group will create a permanent URL and store the page on its servers and on mirror sites around the world.”
If Web sites from respected academic organizations and the government are supposed to be a solid resource, parts of them will have to remain unchangeable. That will be good for search, because it will cut back on sifting through results and on search time. Individuals will have to get used to the idea that books will no longer be the definitive source.
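For readers who want to see how much rot their own citations have picked up, here is a minimal sketch in Python. It is not how Perma CC works; it only asks a server whether a page still exists. The function names `classify_status` and `check_link` are made up for illustration, and treating only 404 and 410 responses as rot is an assumption.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Label an HTTP status code as 'ok', 'rotten' (page is gone), or 'unknown'."""
    if 200 <= status < 400:
        return "ok"
    if status in (404, 410):  # Not Found / Gone: the classic dead link
        return "rotten"
    return "unknown"  # e.g. server errors, which may be temporary


def check_link(url, timeout=10):
    """Send a HEAD request and report the state of the link (needs network access)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is on the exception
        return classify_status(error.code)
    except (urllib.error.URLError, TimeoutError):
        return "unreachable"  # DNS failure, refused connection, or timeout
```

A page can also change its content while still answering with a 200 status, which is the harder problem the permanent-copy approach is meant to address; a status check like this one will not catch that.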
Whitney Grace, October 29, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Ironic Programming How To Guide Teaches Programmers How Not To
October 29, 2013
An article from Typical Programmer titled How to Develop Unmaintainable Software is written by a programmer who spends most of his workdays debugging, maintaining and fixing systems. The almost entirely ironic article is a how-to (or how-not-to) on coding in a way that will make maintenance impossible. Tips like “don’t use version control” and “write everything from scratch” are followed by explanations of how such techniques will make it almost impossible for another programmer to address issues in the code. For example, “Use a bunch of different programming languages and stay cutting-edge” is followed up with,
“Every day HackerNews and Reddit buzz with cool new languages. Try them out on your client’s time… The boundaries between the languages, the incompatible APIs and data formats, the different server configuration requirements are all fun challenges to overcome and post about on StackOverflow…Half-baked caching, aborted Rails and Node.js projects, and especially NoSQL solutions (“It’s more scaleable!”) are my bread and butter.”
We have to wonder if the author had enterprise search systems in mind while writing this hilarious article. In the end, the author explains to his brother and sister programmers that the beauty and functionality of code are secondary to its being easy to work with. Source control, fewer dependencies, and a testing/staging facility are vital to maintaining a system.
Chelsea Kerwin, October 29, 2013
Search Wizards Speak: Oleg Rogynskyy, Semantria
October 28, 2013
Semantria is a company focused on providing text and sentiment analysis to anyone. The company’s approach is to streamline the analysis of content so that in less than three minutes and for a nominal $1,000, the power of content processing can help answer tough business questions.
The firm’s founder is Oleg Rogynskyy, who has worked at Nstein (now part of Open Text) and Lexalytics. The idea for Semantria blossomed from Mr. Rogynskyy’s insight that text analytics technology was sufficiently mature to be useful to almost any organization or business professional.
I interviewed Mr. Rogynskyy on October 24, 2013. He told me:
At Semantria, we want to simplify and democratize access to text analytics technology. We want people to be able to get up and running in no time, with a small budget, and actually derive value from our technology. The classic story is you buy a system worth $100k and don’t deploy it.
Semantria focuses on a class of problems that a few years ago would have been outside the reach of many firms. He said:
We make it simple for our clients to solve the following problems: First, some organizations have too much text to read. For example, a Twitter stream or surveys with many responses. Also, there is the need to move quickly and reduce the time to get to market. Many survey results come with an expiry date before they’re irrelevant. Then there is reporting the information. Anyone can use their Excel smarts to build simple/interesting reports and visuals out of unstructured data. But that can take some time, and Semantria accelerates this step. Finally, users need to analyze text with the same impartiality each time. A human might see a glass as half full or half empty, but Semantria will always see a glass with water.
One of the most interesting aspects of Semantria is that the company delivers its solution as a cloud service. Mr. Rogynskyy observed:
We are happily in the cloud, and in the cloud we trust. We have android and iOS software development kits in the works, so whoever wants to talk to our API from mobile devices will be doing it with ease very soon.
You can get more information about Semantria at https://semantria.com.
This interview is one of more than 60 full-text interviews with individuals who are deeply involved in search, content processing, and analytics. You can find the full series at www.arnoldit.com/search-wizards-speak.
Stephen E Arnold, October 28, 2013
Changing the Approach to Enterprise Search: Horse First
October 28, 2013
Pebble Road’s article analyzing search is titled The Curse of Enterprise Search and How to Break It. The curse referred to is created by the misconception that once enterprise search software has been purchased, it will do all the work and the needs of the business are already handled. The article argues against this lackadaisical approach to search, explaining that search needs to be designed and implemented for a given business with the business’s users in mind. It argues that “search is a negotiation” which is not simply a means to an end but a way to figure out the right question. The article explains with a comparison to camera shopping,
“When you’re searching for a camera, and if you don’t know exactly which one you want, you’re going to start on a search journey, or the negotiation. The journey may take you from locating the right type of cameras → to comparing them → to verifying their details. These three modes are not the same. You are seeking different outcomes in each mode. The modes are like layers of meaning. Meaning that will eventually lead you to make a decision.”
Metadata and appropriate interfaces are the article’s proposed answer to designing search that supports these modes in a useful and productive way. The greatest issue is a growing crevasse between the wealth of information and its findability. To simplify, don’t put the cart before the horse. Design search for your business and then put it in place.
Chelsea Kerwin, October 28, 2013
In Spite of Fears Over AdWords Search Revenue Continues to Rise
October 28, 2013
The article on Search Engine Watch titled Search Revenues Hit $8.7 Billion in First Half of 2013 shows that in spite of concerns over Google’s AdWords adjustments, revenue from search ads continues to rise. The ascent, illustrated with bar and pie charts with data from the last ten years, is clear. The switch to Google’s enhanced campaigns has not caused the reduction that many feared was possible. The article explains,
“Mobile revenue increased by 24 percent from the first half of 2012 ($1.1 billion) compared to the first half of 2013 ($1.3 billion) showing that the mobile ad industry shows no signs of slowing down its growth. “Digital has steadily increased its ability to captivate consumers and then capture the marketing dollars that follow,” said Randall Rothenberg, President and CEO, IAB. “Mobile advertising’s breakneck growth is evidence that marketers are recognizing the tremendous power of smaller screens. Digital video is also on a positive trajectory, delivering avid viewership and strong brand-building opportunities.””
While this is good news for Google and its advertisers, it may leave users in mid-shrug. Isn’t search meant to be about the user’s needs, and not about ad revenue? At this point it is unclear, especially with the Interactive Advertising Bureau’s recent report that clearly demonstrates the health and robustness of the online advertising industry.
Chelsea Kerwin, October 28, 2013
Supercomputer Watson from IBM Put To Work in Hospital
October 28, 2013
The article titled Dr. Watson Will See You Now: IBM’s supercomputer Can Analyze Your Medical History, Point to Likely Diagnoses on TechSpot explores the new work for IBM’s Watson in conjunction with the Cleveland Clinic. The computer is involved in two exciting projects, the first of which is called WatsonPaths and makes use of the supercomputer’s ability to explore a given medical case from many angles and even use feedback from physicians as well as medical journals and the patient’s records. The article explains,
“The technology allows doctors to sift through information much more efficiently than they could on their own, while Watson’s artificial intelligence ties evidence together to support treatment options. Initially WatsonPaths will be used as a classroom training tool designed to help students and physicians make more informed and accurate decisions, and also to improve their critical reasoning skills. But the real-world potential is clear.”
A doctor interviewed explained that Watson had already noticed things in a few cases that he had missed. House fans might think of that famously grumpy doctor who was such an excellent diagnostician. Dr. Watson might be Dr. House without the moodiness, able to cull from a fantastic collection of information and home in on the problem at hand. The other work being given to Watson has to do with organizing the unstructured data lying around the hospital, such as lab results and clinical notes. Watson has been attempting to form them into a more logical and accessible tool using natural language expertise.
Chelsea Kerwin, October 28, 2013
Amazon and Its Money Losing Model
October 27, 2013
I read “Amazon and the Profitless Business Model Fallacy.” The article was the work of a person who once worked at Amazon, departing in 2004. I assume that some of Amazon’s processes are unchanged, but nine years is a long time, even for an addled 70-year-old goose like me.
The main point of the write up is that many people assume that Amazon is a charity. Amazon, the article points out, is “a classic fixed cost business model.” The company uses the Internet:
to get maximum leverage out of its fixed assets, and once it achieves enough volume of sales, the sum total of profits from all those sales exceed its fixed cost base, and it turns a profit. It already has exceeded this hurdle in its past.
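The arithmetic behind a fixed cost model is simple enough to sketch. The numbers below are invented purely for illustration and are not Amazon’s figures; `break_even_units` and `profit` are hypothetical helper names. The assumption is a single fixed cost base and a constant margin per sale.

```python
import math


def break_even_units(fixed_costs, margin_per_sale):
    """Smallest number of sales whose combined margin covers the fixed cost base."""
    if margin_per_sale <= 0:
        raise ValueError("margin per sale must be positive to ever break even")
    return math.ceil(fixed_costs / margin_per_sale)


def profit(units, fixed_costs, margin_per_sale):
    """Profit (or loss, if negative) after selling `units` items."""
    return units * margin_per_sale - fixed_costs


# Hypothetical example: $1,000,000 in fixed costs and $4 of margin per sale
FIXED = 1_000_000
MARGIN = 4
units_needed = break_even_units(FIXED, MARGIN)
```

With these made-up inputs, every sale past the break-even volume adds its full margin to profit, which is the leverage the quoted passage describes. Raising the fixed cost base, as the article says Amazon keeps doing, pushes the break-even volume out and shows up as a loss in the meantime.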
The article asks:
Does Amazon lose money on sales of some individual items? For sure. The first Kindle ebooks that were priced at $9.99 when Amazon had to pay more than that per copy to publisher were one example. Giant, heavy electronics items that Amazon sometimes ships for free when the shipping cost is clearly non-trivial and cost more than the usual thin margins on such goods are another.
The Bezos brilliance takes this approach:
Amazon has decided to continue to invest to arm itself for a much larger scale of business. If it were purely a software business, its fixed cost investments for this journey would be lower, but the amount of capital required to grow a business that has to ship millions of packages to customers all over the world quickly is something only a handful of companies in the world could even afford.
On the subject of Amazon’s interesting financial report, the article states what is obvious to most analysts who have tried to figure out where the money comes from and where the money goes:
The irony of all this is that while Amazon’s public financial statements make it extremely difficult to parse out its various businesses, it is extremely forthright and honest about its business plans and strategy. It’s the reason Jeff continues to reprint its first ever letter to shareholders from 1997 in its annual report every year. The plan is right there before our eyes, but so many continue to refuse to take it at face value. As a reporter, it must be so boring to parrot the same thing from Jeff and his team year after year, so different narratives must be spun when the overall plan has not changed.
The former employee then offers this observation, or is it a threat?
If I were an Amazon competitor, I’d actually regard Amazon’s current run of quarterly losses as a terrifying signal. It means Amazon is arming itself to take the contest to higher ground. The retail game is about to become more, not less, punishing.
Several observations:
First, Amazon is a giant company with customers, cash, and clout. Those who try to get in its way find out that the Amazon business model is not much less forgiving than Google’s. Google has made little headway in online shopping and that suggests some bright folks have bumped into one or two of Amazon’s pointy parts.
Second, price cutting is a great business tactic. Once the competition is gone, then it is pretty easy to move forward. A number of moguls figured this out decades ago. In today’s regulation-soft environment, commercial enterprises are functioning more or less like nation states. Different economic rules apply to nation states. Individuals who shop at the company store have fewer options.
Third, Amazon is one of the firms developing a monoculture for its customers. Once one gets into the Prime, one-click, personalization approach to online activities, an old adage kicks in:
Like a soft bed, a bad habit is easy to get into and hard to get out of.
My interest, now a hobby rather than a job, is search and retrieval. How does the Amazon approach work in search? Amazon offers what I call “corset search.” If you want to get into the darned thing, be prepared to experience some push and pull. If you want a cloud based search system, Amazon is, as I wrote in one of my for-fee columns, a “search lazy Susan.” Just dial up an alternative running on the Amazon system.
Easy. Cheap at least at the outset. Mostly reliable. What more could a vendor want? What more would a user want?
That in my view is the problem with the WalMart approach to technology. Amazon is one of the manifestations of the deep divides that continue to fracture behaviors. I am okay with making my way through an increasingly medieval landscape.
I suppose Wall Street is learning that what look like losses may be something else. We have entered the Dimonization Era. Money is not what it seems perhaps?
Stephen E Arnold, October 27, 2013
Content Analyst Ad with Marcus Zillman
October 27, 2013
I was checking some search related content, and I came across a fascinating advertisement inserted into a Web site operated by Marcus P. Zillman. Here’s the Web page and the ad:
I had a difficult time figuring out if Marcus Zillman (eSolutions architect, international Internet expert, author, keynote speaker and corporate consultant) was Content Analyst or if Content Analyst was Marcus Zillman’s outfit.
I figured out the page, but for a casual researcher, is the juxtaposition misleading? Jarring? Confusing?
I am used to seeing Attivio ads in the most surprising contexts. Now I find Content Analyst following the same path. I suppose it generates sales, but I find the trend fascinating. Perhaps I should advertise my dog Tess, pictured on this blog, on Mr. Zillman’s Web site. She would most certainly benefit from association with a person who is an “eSolutions architect, international Internet expert, author, keynote speaker and corporate consultant.”
Woof.
Stephen E Arnold, October 27, 2013
DocumentCloud Uses OpenCalais and Offers Public and Private Functions
October 27, 2013
If you are in need of a relatively painless way to obtain metadata, DocumentCloud might be your solution. Every uploaded document is run through OpenCalais, which gives users access to a wide range of information mentioned in the documents. It simplifies the search for people, places and organizations in your documents and lets you plot them by the dates mentioned on a timeline that can be as specific or general as you desire.
“Use our document viewer to embed documents on your own website and introduce your audience to the larger paper trail behind your story.
From our catalog, reporters and the public alike can find your documents and follow links back to your reporting. DocumentCloud contains court filings, hearing transcripts, testimony, legislation, reports, memos, meeting minutes, and correspondence. See what’s already in our catalog. Make your documents part of the cloud.”
If you prefer privacy, that is a built-in feature. If you prefer to publish, your documents become a part of the landscape of primary sources in the DocumentCloud catalogue. There is also a highlighting feature that accommodates both public annotations and more private organizational notes. Each note has its own URL, enabling users to show their readers the exact information they need.
Chelsea Kerwin, October 27, 2013
Google Tip and Bill Splitting Calculator
October 27, 2013
We are curious about the reasoning behind the new Google feature revealed in Lifehacker’s piece, “Google Search Now Calculates Tips, Splits the Check.” Yes, a tip calculator has been added to Google Search. If I remember correctly, calculators have been standard features on mobile phones for some time, and the truly mathematically challenged have a number of apps to choose from for these tasks. Why would we need the capability in our search, too?
That question is beyond the scope of Mihir Patkar’s very brief write-up. He does tell us:
“The next time you get the bill at the end of a meal, just Google it. A new tip calculator can instantly tell you how much you should leave behind, and even lets you split the bill between friends.
“The search string to activate the calculator on the Web, iOS or Android is ‘tip for’ followed by the amount. By default, it’s set to 15% split by one person, but you can adjust those parameters as you need.”
… And the tool is already outdated; in recent years, I’ve been hearing that 20 percent is now the standard. Granted, you can change the percentage easily, as well as quickly adjust the number of diners in your party. The function is straightforward and easy to use—convenient, yes, but hardly a clamored-for feature. Still, I can respect a company that encourages its engineers to play around and then puts the results out there, to be embraced or discarded as the public sees fit.
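For anyone who wants to check the math, the calculation is small enough to sketch in a few lines of Python. This is not Google’s code; `tip_and_split` is a made-up function name, and the rounding to the nearest cent is an assumption. The defaults mirror the ones quoted above: 15 percent, split by one person.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def tip_and_split(bill, tip_percent=15, people=1):
    """Return (tip, total, per_person) for a bill, each rounded to the cent."""
    if people < 1:
        raise ValueError("people must be at least 1")
    bill = Decimal(str(bill))
    # Tip is a straight percentage of the bill
    tip = (bill * Decimal(str(tip_percent)) / Decimal(100)).quantize(
        CENT, rounding=ROUND_HALF_UP
    )
    total = bill + tip
    # Each diner pays an equal share of bill plus tip
    per_person = (total / people).quantize(CENT, rounding=ROUND_HALF_UP)
    return tip, total, per_person


# Default settings on an $80.00 bill, then 20 percent split four ways
default_result = tip_and_split("80.00")
generous_result = tip_and_split("80.00", tip_percent=20, people=4)
```

On an $80.00 bill the default works out to a $12.00 tip, while 20 percent split among four diners comes to $24.00 each. Because each share is rounded to the cent, the shares may not add up to the exact total on some bills.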
Cynthia Murrell, October 27, 2013