Is Amazon Building the Next Big Thing?
April 25, 2012
The Network Thinkers (TNT) blog believes it has discovered “The Next Big Thing”: social via Amazon. The write up posits that the information Amazon gathers from Kindle readers goes beyond “customers who bought this item also bought. . .” to include the highlights and notes folks have made in their e-copies. The article asserts:
“It is what we specifically find interesting and useful in those books that reveals deep similarities between people — the hi-lites, bookmarks and the notes will be the connectors. Our choices reveal who we are, and who we are like! Today, Amazon introduces you to similar books. Tomorrow, they will introduce you to similar readers.”
Intriguing. What makes this post more interesting, though, are the comments; ideas presented as new strike some as covering old ground. “Anonymous” notes:
“Eh, not really that under the radar? Kindle.amazon has been recommending readers with similar profiles for quite some time. But more people take photos or have jobs than read books, so the scale will be less?”
Though his or her voice is tenuous, Anonymous makes a good point: book readers seem to be a dwindling breed (sigh), so the Kindleverse is unlikely to rival Facebook or LinkedIn anytime soon. Myspace, maybe.
Cynthia Murrell, April 25, 2012
Sponsored by PolySpot
Autonomy Private Cloud, Big and Growing
April 19, 2012
Now that’s big data: “Autonomy Puts 50 Petabytes in Cloud,” announces Gadget. Autonomy already held the record for the world’s largest private cloud, and this milestone stretches their lead. The write up reports:
“The Autonomy private cloud now manages more than 50 petabytes of web content, video, email and multimedia data on 6,500 servers in 14 data centers around the world. Fifty petabytes is equal to 665 years of HD-TV video, or 1 billion four-drawer file cabinets filled with text.”
Pausing to consider the amount. . . . Yep, that’s a lot. The article asserts:
“The continued dramatic growth of Autonomy’s private cloud is the result of a unique approach to cloud computing. Powered by Autonomy’s Intelligent Data Operating Layer (IDOL), the private cloud automatically recognises concepts and patterns in the billions of structured and unstructured data files it ingests and indexes every day.”
Most vendors talk about big data. Autonomy, it seems to us, does it. The company offers a full range of cloud-based solutions that use Autonomy’s IDOL to tame mind-boggling amounts of unstructured data. Now owned by HP, the company was founded in 1996. It has offices around the world and helps over 65,000 customers derive meaning from their overwhelming collections of information.
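As a rough sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. The only inputs are the numbers in the Gadget quote (50 petabytes, 665 years of video, 6,500 servers); the decimal definition of a petabyte and the 365.25-day year are our assumptions, not something the article states.

```python
# Back-of-the-envelope check of the quoted Autonomy figures.
# Assumptions (ours, not the article's): 1 PB = 10**15 bytes, 1 year = 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
PETABYTE = 10 ** 15

total_bytes = 50 * PETABYTE                # "more than 50 petabytes"
video_seconds = 665 * SECONDS_PER_YEAR     # "665 years of HD-TV video"

# Video bitrate implied by the comparison, in megabits per second.
implied_mbit_per_s = total_bytes * 8 / video_seconds / 1e6

# Average data held per server if the load were spread evenly over 6,500 machines.
tb_per_server = total_bytes / 6500 / 1e12

print(f"Implied video bitrate: {implied_mbit_per_s:.1f} Mbit/s")
print(f"Average per server:    {tb_per_server:.1f} TB")
```

Under those assumptions the comparison works out to about 19 Mbit/s of video, which is a plausible HD bitrate, and to a little under 8 TB per server on average.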
Cynthia Murrell, April 19, 2012
Sponsored by TheTrendPoint
Sponsored by Pandia.com
Want to Slip Around the Systems Department? Go Cloudy
April 18, 2012
Silicon.com presents an interesting view of why cloud computing is of interest to executives in “Business Execs ‘Use Cloud Computing to Dodge the CIO’.” Um, shouldn’t different departments be trying to work together instead of flouting one another? Perhaps that’s too naïve. Or too logical.
Writer Steve Ranger points to research from Forrester which found that two-thirds of CIOs now believe their business sees cloud computing as a way to circumvent IT. Two-thirds! The article tells us:
“Cloud computing is being used as a way for businesses to dodge the IT department and get services delivered more quickly. But as well as giving the CIO sleepless nights, this attempt to side-step the IT department is causing additional cost and complexity along the way.”
That’s one way to manage: go around the “problem”. Another silver bullet play by the uninformed. These executives may, however, find that they are shooting themselves in the feet. Datamation finds “Cloud Computing: Bigger and Better—But Still Flawed.” In that piece, Robert McGarvey explores the way cloud computing has failed to deliver, and the way it is playing out differently than previously conceived. He writes:
“Cloud computing, so far, has not lived up to expectations — it’s slow, it has troubles housing huge enterprise critical data, and it is perceived as insecure.”
We recall that Google at one time was working to find end user and information technology conduits into the enterprise. We are not sure how well this works. Google, after 13 years in business, generates only about five percent of its revenue from sources unrelated to online advertising. If Amazon jumps into the enterprise sector with more than cloud search, we think the developer angle Amazon takes may be more welcome than the “go around” approach.
Cynthia Murrell, April 18, 2012
CloudStack and OpenStack Compete for the Cloud
April 15, 2012
We need an “open” scorecard that goes beyond this blog and OpenSearchNews, our coverage of open source search systems, and TheTrendPoint, our information service about open source analytics.
ComputerWorld reported on an innovative open source cloud project that IBM and Red Hat have recently joined in the article “Report: IBM, Red Hat to Join OpenStack.”
According to the article, the project known as OpenStack was started two years ago by Rackspace and NASA and has since grown to include more than 150 companies and almost two thousand developers. Other backers include HP, Dell, and Internap.
OpenStack is a global collaboration of developers and cloud computing technologists producing an open source cloud computing platform for public and private clouds. The project aims to deliver solutions for all types of clouds by being simple to implement, massively scalable, and feature rich. The technology consists of a series of interrelated projects delivering various components for a cloud infrastructure solution.
As with most projects of this sort, OpenStack already has some well backed competition. The article states:
“Citrix this week announced it would create a competitor to OpenStack by giving its CloudStack platform a license from the Apache Software Foundation. Meanwhile, OpenStack celebrated a milestone on Thursday related to the fifth release of its software, code named Essex.”
This is definitely some interesting open source maneuvering. We’re interested to see the solutions that CloudStack and OpenStack come out with. As important, what does open mean?
Jasmine Ashton, April 15, 2012
Yandex Expands to Storage
April 13, 2012
Yandex is getting frisky.
Yandex, the leading Russian search engine responsible for 63 percent of search traffic in Russia, has launched a beta version of a free service that allows users to store their files online, according to the Taume.com article “Yandex Launches Free Storage Service.”
The service, known as Yandex.Disk, allows web users to access their stored files from any internet-enabled device and upload up to 10 GB of files in any format, be it documents, photos, music, or videos. Yandex.Disk also automatically saves all attachments to the emails in a user’s Yandex.Mail account, which has additional storage space for this purpose.
The article states:
“Yandex.Disk supports synchronization between multiple devices, for example, a text file saved on a home computer can be opened and edited on a laptop at work. Yandex.Disk is accessible via a web interface, as well as via Windows or Mac OS GUI client. Owners of iOS- or Android-based smartphones can also use the service via the Yandex.Mail app.”
Could this be the beginning of a tech Cold War? If Yandex continues integrating its products outside the U.S. and becomes the default search engine for Google rivals such as Apple, the American search giant may have finally met its match. What happens if Yandex gets the idea of Amazonizing itself?
Jasmine Ashton, April 13, 2012
Amazon CloudSearch Demo Available
April 13, 2012
Search Technologies, a search and content processing consulting firm, has a public demonstration of the new Amazon CloudSearch system. The corpus for the demo is Wikipedia. You can check out the demonstration at this link. I ran a query for one of my favorite math guys “Euler”, and this is the result the system displayed. I noticed a latency of about three seconds, but your mileage may vary. With the announcement of the Amazon CloudSearch service, clicking and testing were probably keeping the AWS infrastructure busy.
Several features have been tapped by the Search Technologies’ engineers:
- Facets, or hot links, for Article Type, Category, and Most Recent Author are displayed.
- The snippet averages about 70 words.
- The Categories have a slight highlight.
For more information about Amazon CloudSearch, you can start with Amazon’s own information pages.
For information about Search Technologies, navigate to www.searchtechnologies.com.
Stephen E Arnold, April 12, 2012
Amazon CloudSearch
April 12, 2012
Well, some of the folks working to bolt a search and retrieval system into “big data”, mobile apps, and cloud vendors’ systems are trying to figure out what to do about Jeff Bezos. The head of Amazon has taken time from his space flight activities to disrupt the world of cloud-based search and retrieval.
The announcements were handled in Amazon’s typical mode. Those who were privy to the new service, which is based on A9 with what looks like some open source goodness inside, had to keep quiet. Then Amazon published a chunk of Web pages about the service. You can find most of the basics in this CloudSearch documentation collection.
There are two general interest type blog posts. You may want to check out Dr. Werner Vogels’ “Expanding the Cloud—Introducing Amazon CloudSearch” and the AWS Blog story “Amazon CloudSearch—Start Searching in One Hour for Less Than $100 / Month.”
The system is the “old” A9 search service which received some early life support from Udi Manber, now a Googler. But the features and functions referenced in the documentation suggest that additional work has been done to make facets, snippets, highlighting, and graphic features take advantage of some open source goodness. However, Amazon takes some care to make sure that the provider of the open source goodness is tough to grab. The best example of this method is Amazon’s handling of the Android operating system for the Kindle Fire. Beneath the sluggish interface of the Kindle Fire beats the heart of Android 2.x. Even the Amazon app store runs certain apps, not all of them. The approach works and keeps many of Amazon’s secrets from turning up in Gawker or trendy Silicon Valley blogs. Amazon secrecy is not quite Apple grade, but Amazon is familiar with the orchard.
According to Expanding the Cloud – Introducing Amazon CloudSearch:
Developers set up a Search Domain — a set of resources in AWS that will serve as the home for one collection of data. Developers then access their domain through two HTTP-based endpoints: a document upload endpoint and a query endpoint. As developers send documents to the upload endpoint they are quickly incorporated into the searchable index and become searchable.
Developers can upload data either through the AWS console, from the command-line tools, or by sending their own HTTP POST requests to the upload endpoint.
There are three features that make it easy to configure and customize the search results to meet exactly the needs of the application.
Filtering: Conceptually, this is using a match in a document field to restrict the match set. For example, if documents have a “color” field, you can filter the matches for the color “red”.
Ranking: Search has at least two major phases: matching and ranking. The query specifies which documents match, generating a match set. After that, scores are computed (or direct sort criterion is applied) for each of the matching documents to rank them best to worst. Amazon CloudSearch provides the ability to have customized ranking functions to fine tune the search results.
Faceting: Faceting allows you to categorize your search results into refinements on which the user can further search. For example, a user might search for ‘umbrellas’, and facets allow you to group the results by price, such as $0-$10, $10-$20, $20-$40, etc. Amazon CloudSearch also allows for result counts to be included in facets, so that each refinement has a count of the number of documents in that group. The example could then be: $0-$10 (4 items), $10-$20 (123 items), $20-$40 (57 items), etc.
For more information on the different configuration possibilities visit the Amazon CloudSearch detail page.
Automatic Scaling: Amazon CloudSearch is itself built on AWS, which enables it to handle scale.
Okay, automatic. This sounds like the standard line from every cloud vendor with knowledge of sharding, distributed computing, and work allocation. We noted that the system supports Boolean logic and math operations. That’s good news and long overdue from Amazon.
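To make the filtering, ranking, and faceting ideas in the quoted passage concrete, here is a minimal sketch in plain Python. It is not the CloudSearch API and makes no calls to AWS; the catalog, the field names, the price buckets, and every function name are our own assumptions for illustration.

```python
# Toy, in-memory illustration of matching, filtering, ranking, and faceting.
# Everything here is hypothetical; none of it is Amazon's implementation.

DOCS = [
    {"id": 1, "title": "red travel umbrella", "color": "red", "price": 8.0},
    {"id": 2, "title": "large golf umbrella", "color": "black", "price": 25.0},
    {"id": 3, "title": "red golf umbrella", "color": "red", "price": 18.0},
    {"id": 4, "title": "rain boots", "color": "red", "price": 30.0},
]

PRICE_BUCKETS = [(0, 10), (10, 20), (20, 40)]


def match(docs, term):
    """Matching phase: keep documents whose title contains the term as a word."""
    return [d for d in docs if term in d["title"].split()]


def filter_by(docs, field, value):
    """Filtering: restrict the match set to documents where field equals value."""
    return [d for d in docs if d.get(field) == value]


def rank(docs, score):
    """Ranking phase: order the match set from best to worst by a score function."""
    return sorted(docs, key=score, reverse=True)


def price_facets(docs, buckets):
    """Faceting: count how many documents fall into each price range."""
    counts = {f"${low}-${high}": 0 for low, high in buckets}
    for d in docs:
        for low, high in buckets:
            if low <= d["price"] < high:
                counts[f"${low}-${high}"] += 1
                break
    return counts


matched = match(DOCS, "umbrella")                      # documents 1, 2, 3
red_only = filter_by(matched, "color", "red")          # documents 1, 3
cheapest_first = rank(red_only, score=lambda d: -d["price"])
facets = price_facets(matched, PRICE_BUCKETS)

print([d["id"] for d in cheapest_first])
print(facets)
```

The custom score function stands in for the “customized ranking functions” the quote mentions; here it simply prefers cheaper items. The facet counts are computed over the whole match set rather than the filtered set, so a user could see how many umbrellas sit in each price range before narrowing further.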
Our take on Amazon CloudSearch is that Amazon has introduced a service which will allow developers to get out of the business of figuring out how to bolt a third party search solution to their Amazon content. For organizations looking for a silver bullet to kill the on premises search systems, Amazon has taken a quick step into the search disco.
Will Amazon’s CloudSearch become a viable alternative for on premises search? Will Amazon’s new service put additional pressure on the big enterprise companies like Hewlett Packard and Oracle? Both of these outfits have spent big money buying ageing findability solutions. What about Microsoft with its ubiquitous search solutions included with SharePoint? What happens to mid tier vendors like Lexmark Isys or start ups like DataStax and its Enterprise 2.0 service?
We don’t know. What we do know is that Amazon, unlike Google and Facebook, has found a way to enter a service space without looking much like a head on competitor to any other company. Google has not moved too far from its on premises Google Search Appliance. Facebook continues to dither when it comes to full-on search. Amazon’s challenge will be getting its costs under control and finding a way to placate the Wall Street MBAs. Search on Amazon is, in our opinion, a service which is in dire need of improvement.
Perhaps the CloudSearch will impact the way Amazon.com’s book search works? I am still struggling to find a way to NOT out books which are not yet available. I find the method of coping with titles on the iPad 3 Kindle reading app almost unusable.
Can Amazon do better? Yes. Will CloudSearch be that important leap forward? I don’t know. But I am watching, and I have a hunch that other search vendors, partners, and integrators are checking out this most recent blast from Bezos Land.
Stephen E Arnold, April 12, 2012
Desktop Search Moves to the Cloud
April 12, 2012
TechCrunch’s Colleen Taylor recently reported on a new app called Found that lets you find and access your documents, whether they are on your computer or online, in the article “Found Makes Searching for Files Anywhere Super Simple (and Really Sick).”
According to the article, the San Francisco based app aims to organize the mess of documents that are relevant to our work and personal lives. Found currently plugs into Gmail, Google Documents, and Dropbox and the company says that it will be adding additional integrations in the near future.
Taylor states:
“Once you install it on your computer, looking for things in Found quickly becomes second nature — and you quickly start to wonder about how much time you wasted searching for things before you had it. Of course, the real key will be seeing how snappy the Found app is once more people are using it after the public launch later this spring — nowadays, an app is only as good as it can scale. But at the moment, Found is looking very like a very promising tool for the those of us who are a bit less organized with our files than we’d like to be.”
While the app won’t be released to the public until mid-May, you can see how Found works via an embedded video in Taylor’s article. The notion of a cloud service indexing content on a local machine may give some users pause. We prefer to use behind-the-firewall solutions. Even cloud back ups are solutions which don’t address the issues we face.
Jasmine Ashton, April 12, 2012
MapR Expands Hadoop Connectors
April 4, 2012
This MapR move signals more options for Hadoop users. Talkin’ Cloud reports, “MapR Announces Broad Data Connection Options for Hadoop.” Writer Brian Taylor specifies:
The data connections, according to the press release, enable a ‘wide range of data ingress and egress alternatives for customers,’ including direct file-based access using standard tools; direct database connectivity; Hadoop-specific connectors via Sqoop, Flume and Hive; and access to popular data warehouses and applications using custom connectors.
Sqoop, Flume, and Hive are all open source projects at Apache; the first two are still in incubation.
MapR is getting a hand on this project from tech providers Pentaho and Talend, who will supply direct integration with MapR Distribution. In addition, Tableau Software is helping to promote the new data connection options.
Co-founded by Xoogler M.C. Srivas, MapR has built on the work of developers behind the open source Hadoop, making it “more reliable, more affordable, more manageable and significantly easier to use.” MapR boasts that its innovations help its customers get the most out of the big data phenomenon.
Watch for our forthcoming open source analytics blog. Roll out is April 9, 2012.
Cynthia Murrell, March 29, 2012
Amazon Web Services Explained
April 2, 2012
You can make the Amazon cloud work for you if you attend to the type of information we found here; Digg presents “Cracking the Cloud: an Amazon Web Services Primer.” The article notes:
It’s safe to say that Amazon Web Services (AWS) has become synonymous with cloud computing; it’s the platform on which some of the Internet’s most popular sites and services are built. But just as cloud computing is used as a simplistic catchall term for a variety of online services, the same can be said for AWS—there’s a lot more going on behind the scenes than you might think.
Writer Matthew Braga goes on to elaborate in detail on the workings of AWS. He defines and explains Elastic Compute Cloud (EC2), Elastic Load Balancing (ELB), Elastic Block Store (EBS), and Simple Storage Service (S3). Braga emphasizes that these are just the core components, and that there are many other features of AWS that he doesn’t have space to cover. What he does describe, though, is quite useful information for the tech reader.
Cynthia Murrell, April 2, 2012