Welcome to Big Data TV – Or The One That Started It All

by Xavier

Hello and welcome, I am Xavier Morera and I am very passionate about helping developers understand enterprise search and Big Data.

And today, I welcome you to the first post of the Big Data Inc Series (which will soon be joined by Big Data TV).

So, you might be wondering… what is the Big Data Inc Series? Easy. It is a series of bite-sized posts that explain enterprise search and Big Data.

What is my objective? At a high level, each post will take between 5 and 7 minutes and will give an overview of one particular topic, and only one, with enough information for you to understand the purpose of a particular platform, language, project, or anything else that touches enterprise search and Big Data.

Why am I doing this? First of all, I am really passionate about search and Big Data… like a kid on Christmas day. I do have to admit that I have my preferred platforms, languages, and projects. However, it does not hurt to have an idea of what each one is about.

Also, why are the posts so short? Well, I could go on and on for hours (believe me, or at least my friends, who say that a 45-minute presentation is just a warm-up for me), but the point is that I want to be very concise, get straight to the point, and give you an overall idea. The Big Data Series is not meant to be a set of tutorials. For training, I have several courses at Pluralsight covering topics like Spark, Cloudera CDH, Solr, Hue, Hive, JSON, code profiling and more, and I have also delivered and helped with trainings for Cloudera and Microsoft/HP/Intel.

I will cover a topic, give you a general idea, and let you decide if this is a technology that could be useful in your toolbelt. In many cases, I will point you to where you can learn more, or I will tell you a story or two about how these technologies are used in real life.

So please join me on this journey with the Big Data Series. In our next post, we will talk about how Big Data started, with Hadoop. Also, don’t forget to subscribe to be notified of newly released posts and videos, and to like and share. You can also follow the links below in the description.

And as we Costa Ricans say, pura vida!

The Art of Creating Applications That Have Search

by Xavier

In my Pluralsight trainings, Getting Started with Enterprise Search using Apache Solr and Implementing Search in .NET Applications, one of the things I emphasize quite a bit is how important search is, and yet it is one of the most misunderstood functions of IT and development in general. In this post I will show you an example of how a potentially good app turns into a pretty bad one, mainly because of its search capabilities.

So much so that Pluralsight picked this phrase to tweet about the release of my course, as you can see here:

[Image: Pluralsight tweet, “search is everywhere”]

But now let’s get to the sample. Here’s the scenario:

Problem: Life is busy. No time to go to the supermarket.

Solution: Use your grocery store’s website to buy your food and have it delivered to your home the next day. Charming idea. It did not work for Webvan, but it seems to be doing quite well for Amazon, and in my home town one of the major supermarkets is doing it in a more controlled way with a good delivery service, all for $10. Not too scalable, but for an MVP it is OK. (Read The Lean Startup if you don’t know what an MVP is.)

It may work, or it may not, mainly because of a really bad user experience, but let me get to the point. UX is important! Never forget it!

You get to the app at https://www.automercado.co.cr/aam/showMain.do, and it has four main sections, as you can see here:

[Screenshot: auto]

And here is what they are for:
– On the left they have a directory-style menu organized by aisle. Grouping kind of works, in my opinion, if you are not too sure of what you want, but it is terribly slow and inefficient. They lose cookie points for this.

[Screenshot: 2014-07-02_0638]

– Then in the middle they have a section where they display the products. This is very standard, so it kind of passes; however, they lose cookie points again for having products without pictures or with very weird stretching. They are a supermarket, and a big one, so I am sure they can send a guy with an iPhone to take a quick picture.

[Screenshot: 2014-07-02_0637]

– The cart has a problem: it does not actually display the product name, only the description. Who thought of this? Not even something as simple as a tooltip!

[Screenshot: 2014-07-02_0640]

And then here is the deal breaker for me: BAD SEARCH! As I mentioned above, search is one of the most misunderstood functionalities in IT. A lot of people make a huge mistake here: they assume search can be done with a database. It can, but the end result sucks! And it did suck here.

Let me show you. I want to look for “jabon dial”, which means “Dial soap”. So I just type “Jabon Dial”. Should work, right? It doesn’t! Look at the message: “No results found…”. Also, I hate the CAPS. I can think of one technical reason for them, but it is pretty dumb.

[Screenshot: 2014-07-02_0646]

But why? There are actually 27 types of “Jabon Dial”, as you can see if you type only “Dial”:

[Screenshot: 2014-07-02_0649]

The problem lies here:
– The person who implemented this application had no knowledge of how search works, which is normal, as search is pretty misunderstood.
– But humans don’t search the way engineers want them to. Forcing the user to type a query exactly the way the engineer expects is just lazy and ineffective.
– So the engineers who created this probably went for a simple exact match in a database search.
– This is a terrible user experience. I can bet the farm that Amazon would have closed its doors in the 1990s if they had had such bad search.
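
To illustrate the difference, here is a minimal C# sketch. It is not the store's code: the catalog entries are made up, and it assumes the failure comes from matching the query as one exact phrase while the product names list the words in another order. PhraseSearch imitates what a SQL LIKE '%query%' would do, while TokenSearch checks each word on its own, which is one of the most basic things a search engine does for you.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SearchSketch
{
    // Hypothetical catalog entries; the store's real product names are not known.
    static readonly string[] Catalog =
    {
        "DIAL JABON BARRA ANTIBACTERIAL",
        "DIAL JABON LIQUIDO",
        "PALMOLIVE JABON BARRA",
    };

    // Exact-phrase match: the whole query must appear as one contiguous substring.
    public static List<string> PhraseSearch(string query)
    {
        string phrase = query.ToUpperInvariant();
        return Catalog.Where(p => p.Contains(phrase)).ToList();
    }

    // Token match: every word of the query must appear in the product name, in any order.
    public static List<string> TokenSearch(string query)
    {
        string[] terms = query.ToUpperInvariant()
                              .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return Catalog.Where(p =>
        {
            var words = new HashSet<string>(p.Split(' '));
            return terms.All(t => words.Contains(t));
        }).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(PhraseSearch("Jabon Dial").Count); // 0: no name contains "JABON DIAL"
        Console.WriteLine(PhraseSearch("Dial").Count);       // 2
        Console.WriteLine(TokenSearch("Jabon Dial").Count);  // 2: both Dial soaps match
    }
}
```

A real search engine such as Solr goes much further (analysis, stemming, relevance ranking), so treat this only as a sketch of why the exact match fails.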

How to fix it? Well, go learn how to use a search engine. And that’s why I created my course, Getting Started With Enterprise Search Using Apache Solr: http://www.pluralsight.com/training/Courses/TableOfContents/enterprise-search-using-apache-solr

The Importance of Networking and Good People

by Xavier

Being an entrepreneur is hard. I have several things going at once (yes, a mistake), but I am moving forward. One of the key areas where I put in a good amount of effort is creating Pluralsight trainings. One of those trainings, where I put in a huge amount of work, is “Getting Started with Enterprise Search Using Apache Solr”, which takes a dev with 0 experience in Solr and a bit of .NET and in 3.4 hours teaches him or her how to build a working POC-style project with Solr and a .NET MVC UI.

You can watch the training here: pluralsight.com/training/courses/TableOfContents?courseName=enterprise-search-using-apache-solr

Getting to the point, Pluralsight recently acquired Code School, and to celebrate they opened their library for free for 72 hours. So I announced in a couple of LinkedIn groups that the course on Solr would be free during that time, in case anyone wanted to take advantage of the offer.

I got a huge surprise when I saw a newsletter from Solr-Start (www.solr-start.com) announcing this. It turns out that Alexandre Rafalovitch, a well-known Solr popularizer and author, saw my notice and blasted off an email to his crowd.

It feels great when a good author shares your news in a newsletter! I hadn’t even asked him to do this; he did it on his own, and for that I really have to thank him.

And by the way, if you are just getting started with Solr, his book Instant Apache Solr for Indexing Data How-to is an excellent resource that can help you understand how to index data. It has a lot of great tips and examples. I got it from Amazon a while back and it has helped me greatly. 100% recommended!

You can get it here:

https://www.packtpub.com/big-data-and-business-intelligence/instant-apache-solr-indexing-data-how-instant

Or on Amazon.com, where, as you can see, I bought it a year ago.


Best Practice and Development Tip: Don’t Reinvent the Wheel in C#

by Xavier

This is just a quick tip and development best practice based on a few things I’ve found while fixing bugs in an application. It is not just a quick tip on how to get the extension of a file; it is about not reinventing the wheel, thinking about all possibilities and outcomes when you are programming, and in general doing things right.

The idea is that whenever you have a problem to solve, for example getting the extension of a given file, you should find the appropriate framework function instead of trying to solve it on your own. Someone has definitely already spent a lot of time creating a function that handles many potential scenarios.

Here is what I found:

[Screenshot: Don't reinvent the wheel]

What is the problem? For any file name that includes multiple “.” characters, the extension is extracted incorrectly, as you can see.

How should I handle this? Well, you might be thinking: “Oh, look for the first “.”, but from right to left!”

Hmmmm yes…maybe… but no sale.
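
To make both pitfalls concrete, here is a rough C# sketch. It is not the code from the application; BySplit and ByLastDot are hypothetical hand-rolled helpers, and the file names are invented. The first one breaks on names with several dots, and the "right to left" variant still gets confused when the dot belongs to a folder name.

```csharp
using System;

public static class NaiveExtension
{
    // Hypothetical hand-rolled approach: split on '.' and take the second piece.
    public static string BySplit(string fileName)
    {
        string[] parts = fileName.Split('.');
        return parts.Length > 1 ? "." + parts[1] : "";
    }

    // Hypothetical "first dot from right to left" approach.
    public static string ByLastDot(string path)
    {
        int i = path.LastIndexOf('.');
        return i >= 0 ? path.Substring(i) : "";
    }

    public static void Main()
    {
        Console.WriteLine(BySplit("report.final.pdf"));     // .final (wrong)
        Console.WriteLine(ByLastDot("report.final.pdf"));   // .pdf (right this time)
        Console.WriteLine(ByLastDot("backup.2014/readme")); // .2014/readme (wrong: the dot is in the folder name)
    }
}
```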

Instead you should use the appropriate framework libraries. Read this: http://msdn.microsoft.com/en-us/library/system.io.path.getextension(v=vs.110).aspx

Path.GetExtension Method

.NET Framework 4.5


Returns the extension of the specified path string.

Namespace:  System.IO
Assembly:  mscorlib (in mscorlib.dll)

Syntax

C#

public static string GetExtension(string path)

Parameters

path

Type: System.String

The path string from which to get the extension.

Return Value

Type: System.String
The extension of the specified path (including the period “.”), or null, or String.Empty. If path is null, GetExtension returns null. If path does not have extension information, GetExtension returns String.Empty.
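
As a quick check of the behavior described above, here is a small sketch that calls Path.GetExtension on a few invented sample paths. The file names are only examples; the calls themselves are the standard System.IO API.

```csharp
using System;
using System.IO;

public static class ExtensionDemo
{
    public static void Main()
    {
        // Several dots: only the last segment of the file name counts.
        Console.WriteLine(Path.GetExtension("report.final.pdf")); // .pdf

        // A dot in a folder name is not treated as an extension.
        Console.WriteLine(Path.GetExtension("backup.2014/readme") == string.Empty); // True

        // No extension information at all: String.Empty.
        Console.WriteLine(Path.GetExtension("README") == string.Empty); // True

        // A null path returns null instead of throwing.
        string missing = null;
        Console.WriteLine(Path.GetExtension(missing) == null); // True
    }
}
```

Note how the framework method already covers the cases a hand-written version tends to miss.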

And always try to do the same. Think about all the possibilities, and whenever you can, check whether you are reinventing the wheel.