Quote to Note: Big Data Getting Even Biggerest Super Fastly

August 11, 2014

I love quotes about Big Data. “Big” is relative. You have heard a doting parent ask a toddler, “How big are you?” The toddler puts up his or her arms and says, “So big.” Yep, big at a couple of years old and 30 inches tall.

“If You Think Big Data’s Big Now, Just Wait” contains a quote attributed to a Big Data company awash in millions in funding money. Here’s the item I flagged for my Quote to Note file:

“The promise of big data has ushered in an era of data intelligence. From machine data to human thought streams, we are now collecting more data each day, so much that 90% of the data in the world today has been created in the last two years alone. In fact, every day, we create 2.5 quintillion bytes of data — by some estimates that’s one new Google every four days, and the rate is only increasing…”

I like the 2.5 quintillion bytes of data.
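For a sense of scale, here is a back-of-the-envelope sketch of what the quoted figures imply. It assumes that “quintillion” is the short-scale 10^18 and that units are decimal. The size of “one Google” is only what the quote’s own ratio works out to, not a measured number, and the function names are made up for illustration.

```python
# Back-of-the-envelope arithmetic on the quoted figures.
# Assumptions: "quintillion" = 10**18 (short scale); 1 exabyte = 10**18 bytes.
BYTES_PER_DAY = 25 * 10**17   # 2.5 quintillion bytes, kept as an integer
EXABYTE = 10**18


def exabytes_per_day(bytes_per_day=BYTES_PER_DAY):
    """Daily data volume expressed in exabytes."""
    return bytes_per_day / EXABYTE


def implied_google_size_eb(days_per_google=4, bytes_per_day=BYTES_PER_DAY):
    """Size of "one Google" implied by the quote's own ratio, in exabytes."""
    return days_per_google * bytes_per_day / EXABYTE


if __name__ == "__main__":
    print(exabytes_per_day())        # exabytes created per day
    print(implied_google_size_eb())  # exabytes in one quoted "Google"
```

Under those assumptions the claim amounts to 2.5 exabytes a day and a “Google” of about 10 exabytes.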

I am confident that Helion, IBM’s brain chip, and Google’s sprawling system can make data manageable. Well, more correctly, fancy systems will give the appearance of making quintillions of whatevers yield actionable intelligence.

If you do the Samuel Taylor Coleridge thing and enter into a willing suspension of disbelief, Big Data is just another opportunity.

How do today’s mobile equipped MBAs make decisions? A Google search, ask someone, or guess? I suggest you consider how you make decisions. How often do you have an appetite for SPSS style number crunching or a desire to see what’s new from the folks at Moscow State University?

Yep, data intelligence for the tiny percentage of the one percent who paid attention in statistics class. This is a type of saucisson I enjoy so much. Will this information find its way into a Schubmehl-like report about a knowledge quotient? For sure I think.

Stephen E Arnold, August 11, 2014

Watson Goes To Medical School

August 11, 2014

Watson has been trying his hand at becoming a gourmet chef, but now the smart machine plans to support healthcare providers with medical knowledge. According to the Technology Review article “IBM Aims To Make Medical Expertise A Commodity,” IBM wants to support organizations by giving them a cheaper way to improve their expertise. The push to make medical knowledge a commodity comes from the rising worry that cancer rates are going to soar in the next decade and there will not be enough medical professionals to go around.

IBM wants to prove that Watson can be used beyond cooking and answering trivia questions. The computer has been deployed in two beta tests with hopes it will improve service quality and take the paperwork load off healthcare professionals:

“Lynda Chin, a professor of genomic medicine at MD Anderson and a leader of the center’s Watson project, anticipates that in the future that kind of product will be highly valued by general oncologists and regional cancer practices. ‘Physicians are too burdened on paperwork and squeezed on revenue to keep up with the latest literature,’ she says. That limits the care physicians can deliver, and it has financial consequences: ‘If you can’t make a decision based on your own knowledge, you have to refer the patient out, and that’s going to hurt your bottom line.’”

Dr. Watson has yet to earn money for IBM, whose revenue has decreased the past two years due to cloud deployment. The betas are only being used for research and development and they are demonstrating that Watson has trouble deciphering medical jargon.

IBM is trying to earn a buck on the changing medical industry. The article ends on how IBM will try to monopolize Watson for healthcare, but it is disappointing that the patients come off as an afterthought. Making money comes first, while saving lives is second.

Whitney Grace, August 11, 2014
Sponsored by ArnoldIT.com, developer of Augmentext

Bye-Bye Tumblr Blog, Hello Reddit Blog

August 11, 2014

Blogs are popular ways to get live news and firsthand accounts of events. Tumblr is one of the more popular platforms to launch a blog, but Reddit, the popular Internet content site, is in the beta phase of creating its own blogging platform.

The Next Web article “Reddit Live Is Official, Lets Anyone Create Their Own Live Blog” describes a new blogging platform that specializes in live blogging. Live blogging is the practice of commenting on a live event, such as a TV show or sports competition, while it is happening. The new platform was in a beta period, where only Reddit employees were able to create live streams. Now anyone can create a live blog. Live blog posts will update in real time and they support embedded tweets, links, and YouTube videos.

While Reddit’s new platform is already popular, it does come with a warning label:

“Reddit has proven its ability to effectively crowd source information online, but it comes with inherent difficulties. The site came under fire in April 2013 for its reaction to the Boston Marathon bombings, with what was later described as “online witch hunts” and “dangerous speculation.” Reddit will need to be extra careful if it wants to ensure such an incident doesn’t occur again.”

Stan Lee and Steve Ditko’s phrase “with great power comes great responsibility” rings true for Reddit. The live blogging platform could become the new way news circulates around the globe. The only problem is that fact checking does not take place as quickly as speculation.

Whitney Grace, August 11, 2014

Google Investing Billions to Get Billions. Obviously.

August 10, 2014

I read “5 Google Projects That Will Pave the Future.” The title confused me. I think the author wanted me to think that Google was paving the way to the future. What I interpreted the title to mean is that Google wants to cover the future with Google’s own digital macadam.

The point of the write up is that Google is doing some big, speculative projects. Bell Labs used to do this, but without the fanfare. But there is a public relations and marketing battle underway among the giant companies that seek to monopolize markets if not the “future.”

The write up mentions Project Loon (the big balloons that will deliver Internet access to folks without the benefit of non balloon methods), Calico (this is the live forever stuff that recently experienced the departure of a nanotech self assembler due to some differing opinions), robots (mobile, smart gizmos that entrance the folks at DARPA), self driving cars (more time to surf the Web and consume ads in a vehicle), and DeepMind (more of the artificial intelligence hoo hah).

Good stuff for those who consumed science fiction, Star Trek, and Star Wars. The only problem is that those billions have to come from someplace. That’s a point overlooked in the Loon plus four article.

That’s why you will want to read “Dear Google, I Am Writing an Open Letter from the Search Wilderness.” The main point of this write up is that Google is investing considerable time and effort to generate revenue from its traffic. I suppose this is obvious to most Mad Ave types, but it appears to have come as a surprise to the author of this letter.

The passage I highlighted was:

It is now a directory of large public or soon to be public companies, who dominate every inch of our screens. I am sure we have all walked down many high streets with all the same chain stores and brands. This is Google Search today across many of the world’s markets. Gone is the opportunity to explore and unearth gems and engage with individuals on the world’s largest stage where a digital high street could have a thousand specialist shops with ease. There are sophisticated ways and means to search and uncover the unusual, the new and the people who care and services that actually work. But directionally, “Search” heads to the money instantly!

Note the phrase “heads to the money instantly.” Here at Beyond Search, I am indifferent to traffic, PageRank, speed with which Google indexes the content, and anything other than the topics that catch my attention. The reason is that I am retired and this blog is a way to fill time between walking my dogs and napping.

For the author of the letter, Google’s focus on money is, it appears, destroying his business. Well, that’s what happens when one builds on a free service. Personally I think Google can destroy as many businesses as necessary to generate money for:

  • Projects like Loon
  • Flying around to cut deals for Google Glass
  • Replacing people like Babak Amirparviz (aka Parviz, Parvis, and Amir Parviz)
  • Paying for Google health care so some Googlers can spend three months in Stanford’s medical facilities
  • Paying for jets
  • Using Steve Ballmer’s running into the wall method to crack into money making television
  • Buying companies to amplify usage behavior capabilities.

These initiatives cost money.

I find the complaining in this open letter like King Lear’s howling in the storm:

Lets face it though, with so few slots its a money page now, not a joy to visit any longer!

Wow. Harsh. Google results are not objective and fun.

Here’s an even more subversive view of Google’s search system which cost billions to develop:

So quite interestingly the guest who has relied on Google to sort his problems and assist in his own search has been guided by Google’s very own algorithm to a hotel or holiday home that is not necessarily the best for him, at the best price or with the best amenities who often stands no chance of communicating with the accommodation provider until he has booked! Pay up and hope for the best as the business has no product knowledge, location familiarity, in depth business knowledge, controls or quality control in place!

And here is a thought that I have never entertained:

The consumer may just look elsewhere and try other search engines, as all he may see are the high street brands, the ones he was overjoyed to have dodged when the web was in its infancy and when Google Search revealed a whole myriad of exciting new places, people and products!

Well maybe.

The point is that Google is essentially operating as a country. The country’s productivity has to go up. In order to pump up the revenue, the altruism and baloney like “do no evil” or “make all the world’s information accessible” are shibboleths for monetization.

What I find interesting is that Google’s business model is not a Google invention. The idea for pay to play came from GoTo.com (Overture.com). Yahoo owned this company. Google was inspired by Overture’s revenue methods and Yahoo settled for some money in a mild dispute about the use of this monetization method.

You can believe in Loon. I believe in what Google does after 15 years: Sell ads. Last time I checked, the folks with the money can buy lots of ads. Folks who cannot afford to advertise need to find their future elsewhere.

If some Web sites get zero traffic, well, get on that social media tsunami. Google has a mission to deliver revenues and profits every 90 days. That mission does not necessarily coincide with that of others. If you are unfamiliar with this Google process, find an MBA and ask.

Stephen E Arnold, August 10, 2014

Actuate: Some Metrics

August 9, 2014

The blurring of search, business intelligence, and number crunching makes it difficult to figure out exactly what a company licenses. In the case of Actuate, there are some crystal clear products and services, and there are some which weave across boundaries.

For some, Actuate means an open source business data-reporting project launched by the Eclipse Foundation in 2004. You can download Eclipse BIRT here.

Actuate released BIRT Analytics 4.4, a commercial product, in July 2014. The company issued a news release titled “Actuate Announces BIRT Analytics 4.4 for Even Easier and Faster Big Data Advanced Analytics for Business Professionals.” Actuate employs the jargon that electrifies those who ride the data analytics bandwagon; for example:

BIRT Analytics 4.4 is a sophisticated, end-to-end software solution that allows users to extract maximum value from Big Data, in the form of visual statistical insights that enable sharper commercial decision-making, and greater customer responsiveness, providing organizations a powerful competitive edge. The built-in, columnar database engine loads at an unrivalled speed of up to 60 GB/hour. With BIRT Analytics 4.4, users are able to explore up to 6 billion records in less than a second, and perform advanced analytics on a million records in under a minute. Business analysts and business users can get to the exact insight they need in seconds rather than days or weeks – freeing IT and data scientists to work on projects that require their expertise. A new user interface (UI) and instructions further increase productivity for business users and administrators.
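To put the “unrivalled” load rate in everyday units, here is a rough sketch of the arithmetic. It assumes decimal units and treats the “up to” peak as a sustained rate, which is generous. The 1 TB dataset is a hypothetical size chosen for illustration and does not appear in the news release.

```python
# Rough arithmetic on the quoted load rate.
# Assumptions: decimal units (1 GB = 1000 MB) and a sustained rate equal
# to the quoted "up to" peak.
QUOTED_GB_PER_HOUR = 60


def load_rate_mb_per_s(gb_per_hour=QUOTED_GB_PER_HOUR):
    """Convert a GB/hour load rate to MB/second."""
    return gb_per_hour * 1000 / 3600


def hours_to_load(dataset_gb, gb_per_hour=QUOTED_GB_PER_HOUR):
    """Hours needed to load a dataset of the given size at the given rate."""
    return dataset_gb / gb_per_hour


if __name__ == "__main__":
    print(round(load_rate_mb_per_s(), 1))  # MB per second
    print(round(hours_to_load(1000), 1))   # hours for a hypothetical 1 TB
```

On these assumptions, 60 GB/hour is a bit under 17 MB per second, and a hypothetical terabyte would take roughly 17 hours to load.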

The news release should pump some life into Actuate’s revenues which were $135 million for the year ending 12-31-2013. In May 2014, the company reported a quarterly decrease in net income and a decrease in net operating cash flow. Emerging Growth’s report “Actuate Corporation Offers Underwhelming Performance” stated:

The revenue fell significantly faster than the industry average of 6 percent. Compared to the same quarter last year, Actuate revenues fell by 31 percent.

Is Actuate struggling with some of the same market forces that bedevil search and content processing vendors? Announcements and feature upgrades have to translate into sustainable revenue; otherwise, stakeholders will become increasingly grumpy.

Stephen E Arnold, August 9, 2014

When a Search Vendor Says Fuzzy

August 9, 2014

Short honk: You may have heard a search or content processing vendor use the word “fuzzy” or “fuzzify” to describe a smart system. If you have, you may want to know what may be behind the jargon curtain. For fuzzy insights (intentional word choice, gentle reader) check out “Binary Fuzzing Strategies: What Works, What Doesn’t.” If you are not sure what “works” means, feel free to contact the saucisson at the IDC-type consulting firms. Illumination is only a payment away. If you choose another route, get your math T shirt on and check out https://code.google.com/p/american-fuzzy-lop/wiki/StatusScreen.

Stephen E Arnold, August 9, 2014

IBM Buzz Equals Revenues: The Breakthrough Assumption

August 8, 2014

I am no wizard of finance. I have kept track of money for my Cub Scout troop. I do understand this chart from Google Finance:

image

The blue column shows that revenue is going nowhere, maybe even trending down. The red line shows IBM’s profit margin which is flat. And the gold bar presents IBM’s operating income. Notice that it is flat. The flat lines are achieved by cost cutting, selling off dead end businesses, and introducing innovations like offices an employee has to sign up to use.

I have focused on IBM Watson because I am interested in search and content processing. To eliminate confusion, I don’t work in this field. It is a hobby. This is a fact that perplexes the public relations professionals who want me to write about their client. Yep, that works really well. If you read my comments in this blog, you will know that I take a slightly more skeptical approach to the search and content processing saucisson that flows across my desk here in Harrod’s Creek, Kentucky. If you are a fan of ground up mystery meat, you can check out my most recent saucisson reveal here.

What caught my attention today was not a report about IBM landing a major deal. Nope. I did not notice a story about IBM’s Jeopardy champ smashing Autonomy’s single quarter revenues prior to the company’s sale to Hewlett Packard. Nope. I did not read about a billion dollar licensing deal for IBM’s semantic technology to a mobile phone giant. Nope.

What I learned about was an IBM chip that does not use Von Neumann architecture. Now this is good news. In my intelligence community lecture about the computational limitations of today’s content processing systems, the culprit is Von Neumann’s approach to computing. In a nutshell, some numerical recipes cannot be calculated because of pesky hurdles like Big O or P=NP.

IBM, if I believe the flood of remarkably similar articles, has kicked Von Neumann to the side of the road with SyNapse. I do like the quirky capitalization and the association of a neural synapse in a brain and IBM’s innovation.

Check out “IBM Chip Processes Data Similar to the Way Your Brain Does.” You can find almost the same story in the New York Times, the Wall Street Journal, and other “real” journalistic constructs. (IBM’s public relations firm certainly delivered some serious content marketing in my opinion.)

Here’s a quote I noted from the Technology Review article:

The new chip is not yet a product, but it is powerful enough to work on real-world problems. In a demonstration at IBM’s Almaden research center, MIT Technology Review saw one recognize cars, people, and bicycles in video of a road intersection. A nearby laptop that had been programmed to do the same task processed the footage 100 times slower than real time, and it consumed 100,000 times as much power as the IBM chip. IBM researchers are now experimenting with connecting multiple SyNapse chips together, and they hope to build a supercomputer using thousands.

There is a glimpse of the future in this passage and a reminder that quite a bit of work remains; for example, “they [IBM researchers] hope to build a supercomputer…”

Hope.

In addition to low power consumption, the “breakthrough” gives IBM an opportunity to “create a library of ready-made blocks of code to make the process easier.”

Who is fabricating the chip? According to IBM’s statement in “New IBM SyNapse Chip Could Open Era of Vast Neural Networks,” the fabricator of the 5.4 billion transistor chip is Samsung. The IBM statement says:

The chip was fabricated using Samsung’s 28nm process technology that has a dense on-chip memory and low-leakage transistors.

That seems like a great idea. I wonder if any of the Samsung engineers learned anything from the exercise. Probably not. The dust up between Samsung and some of its other “partners” is probably fictional. Since IBM seems to be all thumbs when it comes to fabbing chips, the Samsung step may be a “we had no options” action.

IBM’s breakthrough is not just a chip. Nope. It seems to be:

a component of a complete end-to-end vertically integrated ecosystem spanning a chip simulator, neuroscience data, supercomputing, neuron specification, programming paradigm, algorithms and applications, and prototype design models. The ecosystem supports all aspects of the programming cycle from design through development, debugging, and deployment.

To speed along understanding of what IBM has figured out:

IBM has designed a novel teaching curriculum for universities, customers, partners, and IBM employees.

I assume this is part of IBM’s master plan for generating more revenue and profit.

Several thoughts crossed my mind as I worked through some of the “real” news outfits’ reports about the SyNapse:

  1. How long will it be before IBM’s customers, partners, and employees create a product that generates revenue?
  2. Will the SyNapse eliminate the lengthy training and configuration processes for IBM Watson?
  3. Will Samsung and other customers, partners, and RIFfed IBM employees stand on the shoulders of the giants in IBM’s research centers and make money before IBM can get its aircraft carrier fleet turned in a new direction?

I don’t want to rain on the very noisy parade, but I think neurosynaptic technology will require considerable time, money, effort, and coding. But if it boosts IBM’s stock price and creates sales opportunities, SyNapse will have played its part in making the revenue line and the net profit line perform a Cobra and blast upward like an SU 35.

While I wait for Watson, I will use Bing, Google, and Yandex for search. Limited and old fashioned technology that sort of works. Watson running on SyNapse, an interesting lab project that has produced some massive content marketing zing.

Stephen E Arnold, August 8, 2014

LanceList Sounds More Credible Than Craigslist

August 8, 2014

It is a hard knock life pounding the pavement, searching for a legitimate freelance IT job that pays well. There is always Craigslist and Elance.com, but Craigslist is not always reputable and Elance.com is only one place to look for jobs. Lifehacker has come to the rescue, reporting on a new Web site that will help freelance IT workers searching for a job: “LanceList Collects Together Freelance Job Boards.”

LanceList offers something unique for freelancers:

“LanceList is a curated list of freelance jobs boards for creative work, hardware, marketing, writing, and development. Essentially, it’s a great starting place if you’re new to freelancing or if you’re just looking to branch out and get more work. The list is curated by just one person, but it includes all the major freelance sites alongside a few of the more niche ones.”

The big draw is that it includes postings from major job boards all in one place. Freelancers can save time and put their energy into applying for jobs instead of having to search. Having all data in one place these days is a big pull to attract visitors. Aggregating data from Web sites is becoming a not-so-new, yet popular trend. If freelancing is not your gig, maybe coming up with new apps that specialize in pulling data from multiple sources is the new way to go.

Whitney Grace, August 08, 2014

Search Be Random

August 8, 2014

The early days of Internet search always yielded a myriad of search results. No two searches were ever alike and sponsored ads never made it to the top, because they were not around much. It was especially fun, because you got to see more personal, less corporate content. Now search results are more accurate, but they are cluttered with paid links. Given that humans are also creatures of habit, we tend not to stray far from our safe surfing paths and, shockingly, the Internet can become a boring place.

Makeuseof.com wrote “Discover Interesting Content With Five Ways To Randomize The Internet” and it points out some neat ways to discover new information. It highlights basic ways: Random Wikipedia, random Google Street View, random YouTube, and random Reddit. Be prepared to get sucked into Internet links, videos, and photos for hours if you use any of these tools of randomness. Random Website takes users to any random Web site in its generator.

“How often do you find yourself on the Internet looking at the same boring pages? You know there is something out there but you don’t know where to look. Trust me, how bad could it be?”

What is fun is being taken to dark pages of Web 1.0 or a Web site that serves no purpose other than hosting a single word on a single page.

A lot of Internet content is weird, as seen by using these tools, but some of it can lead you to new thoughts and interests. If you need a metaphor, imagine the Internet is like an encyclopedia, except the entries never end and contain all the information about a topic instead of a short summary.

Whitney Grace, August 08, 2014

Does Anything Matter Other Than the Interface?

August 7, 2014

I read what I thought was a remarkable public relations story. You will want to check the write up out for two reasons. First, it demonstrates how content marketing converts an assertion into what a company believes will generate business. And, second, it exemplifies how a fix can address complex issues in information access. You may, like Archimedes, exclaim, “I have found it.”

The title and subtitle of the “news” are:

NewLane’s Eureka! Search Discovery Platform Provides Self-Servicing Configurable User Interface with No Software Development. Eureka! Delivers Outstanding Results in the Cloud, Hybrid Environments, and On Premises Applications.

My reaction was, “What?”

The guts of the NewLane “search discovery platform” is explained this way:

Eureka! was developed from the ground up as a platform to capture all the commonalities of what a search app is and allows for the easy customization of what a company’s search app specifically needs.

I am confused. I navigated to the company’s Web site and learned:

Eureka! empowers key users to configure and automatically generate business applications for fast answers to new question that they face every day. http://bit.ly/V0E8pI

The Web site explains:

Need a solution that provides a unified view of available information housed in multiple locations and formats? Finding it hard to sort among documents, intranet and wiki pages, and available reporting data? Create a tailored view of available information that can be grouped by source, information type or other factors. Now in a unified, organized view you can search for a project name and see results for related documents from multiple libraries, wiki pages from collaboration sites, and the profiles of project team members from your company’s people directory or social platform.

“Unified information access” is a buzzword used by Attivio and PolySpot, among other search vendors. The Eureka! approach seems to be an interface tool for “key users.”

Here’s the Eureka technology block diagram:

image

Notice that Eureka! has connectors to access the indexes in Solr, the Google Search Appliance, Google Site Search, and a relational database. The content that these indexing and search systems can access includes Documentum, Microsoft SharePoint, OpenText LiveLink, IBM FileNet, file shares, databases (presumably NoSQL and XML data management systems as well), and content in “the cloud.”

For me the diagram makes clear that NewLane’s Eureka is an interface tool. A “key user” can create an interface to access content of interest to him or her. I think there are quite a few people who do not care where data come from or what academic nit picking went on to present information. The focus is on a harried professional, like an MBA who has to make a decision “now” and needs some information.
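The layering in the diagram can be sketched in a few lines. This is a minimal illustration of the general “interface on top of connectors” pattern, not NewLane’s code or API. Every name here (Hit, Connector, InMemoryConnector, unified_view) is hypothetical, and the in-memory lists stand in for real indexes such as Solr or a search appliance.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass
class Hit:
    source: str   # which backend index returned the hit
    title: str
    score: float  # backend-specific relevance; not comparable across sources


class Connector(Protocol):
    """Anything that has a name and can answer a query with hits."""
    name: str

    def search(self, query: str) -> List[Hit]: ...


class InMemoryConnector:
    """Hypothetical stand-in for a real index behind a connector."""

    def __init__(self, name: str, docs: List[str]):
        self.name = name
        self.docs = docs

    def search(self, query: str) -> List[Hit]:
        # Naive case-insensitive substring match, enough for a sketch.
        q = query.lower()
        return [Hit(self.name, d, 1.0) for d in self.docs if q in d.lower()]


def unified_view(query: str, connectors: List[Connector]) -> Dict[str, List[Hit]]:
    """Fan the query out to every connector and group the hits by source."""
    return {c.name: c.search(query) for c in connectors}


if __name__ == "__main__":
    wiki = InMemoryConnector("wiki", ["Project Apollo kickoff notes", "Lunch menu"])
    files = InMemoryConnector("file_share", ["apollo budget.xlsx"])
    for source, hits in unified_view("apollo", [wiki, files]).items():
        print(source, [h.title for h in hits])
```

Note what the sketch does not do: the top layer passes along whatever each index returns. Nothing in it checks whether the underlying content is fresh, complete, or correct.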

image

Archimedes allegedly jumped from his bath, ran into the street, and shouted “Eureka.” He reacted this way, I learned from a lousy math teacher, because he had a mathematical insight about displacement. The teacher did not tell me that Archimedes was killed because he was working on a math problem and ignored a Roman soldier’s command to quit calculating. Image source: http://blocs.xtec.cat/sucdecocu/category/va-de-cientifics/

I find interfaces a bit like my wife’s questions about the color of paint to use for walls. She shows me antique ivory and then parchment. For me, both are white. But for her, the distinctions are really important. She knows nothing about paint chemistry, paint cost, and application time. She is into the superficial impact the color has for her. To me, the colors are indistinguishable. I want to know about durability, how many preparation steps the painter must go through between brands, and the cost of getting the room painted off white.

Interfaces for “key users” work like this in my experience. The integrity of the underlying data, the freshness of the indexes, the numerical recipes used to prioritize the information in a report are niggling details of zero interest to many system users. An answer—any answer—may be good enough.

Eureka! makes it easier to create interfaces. My view is that a layer on top of connectors, on top of indexing and content processing systems, on top of wildly diverse content is interesting. However, I see the interfaces as a type of paint. The walls look good but the underlying structure may be deeply flawed. The interface my wife uses for her walls does not address the fact that the wallboard has to be replaced BEFORE she paints again. When I explain this to her when she wants to repaint the garage walls, she says, “Why can’t we just paint it again?” I don’t know about you, but I usually roll over, particularly if it is a rental property.

Now what does the content marketing-like “news” story tell me about Eureka!

I found this statement yellow highlight worthy:

Seth Earley, CEO of Earley and Associates, describes the current global search environment this way, “What many executives don’t realize is that search tools and technologies have advanced but need to be adapted to the specific information needed by the enterprise and by different types of employees accomplishing their tasks. The key is context. Doing this across the enterprise quickly and efficiently is the Holy Grail. Developing new classes of cloud-based search applications are an essential component for achieving outstanding results.”

Yep, context is important. My hunch is that the context of the underlying information is more important. Mr. Earley, who sponsored an IDC study by an “expert” named Dave Schubmehl on what I call information saucisson, is an expert on the quasi academic “knowledge quotient” jargon. He, in this quote, seems to be talking about a person in shipping or a business development professional being able to use Eureka! to get the interface that puts needed information front and center. I think that shipping departments use dedicated systems whose data typically do not find their way into enterprise information access systems. I also think that business development people use Google, whatever is close at hand, and enterprise tools if there is time. When time is short, concise reports can be helpful. But what if the data on which the reports are based are incorrect, stale, incomplete, or just wrong? Well, that is not a question germane to a person focused on the “Holy Grail.”

I also noted this statement from Paul Carney, president and founder of NewLane:

The full functionality of Eureka! enables understaffed and overworked IT departments to address the immediate search requirements as their companies navigate the choppy waters of lessening their dependence on enterprise and proprietary software installations while moving critical business applications to the Cloud. Our ability to work within all their existing systems and transparently find content that is being migrated to the Cloud is saving time, reducing costs and delivering immediate business value.

The point is similar to what Google has used to sell licenses for its Google Search Appliance. Traditional information technology departments can be disintermediated.

If you want to know more about NewLane, navigate to www.fastlane.com. Keep a bathrobe handy if you review the Web site relaxing in a pool or hot tub. Like Archimedes, you may have an insight and jump from the water and run through the streets to tell others about your insight.

Stephen E Arnold, August 7, 2014
