Protect Your Productive Time

by Xavier

You only have 24 hours a day…1,440 minutes…86,400 seconds. At first glance, this may seem like a lot.

But it is not.

There is a very limited supply of time, and it can be your friend or your enemy. And you get to choose if you are in control of your time, or if you let others control it.

That is why you need to Protect Your Productive Time.

But why? Why do you need to protect your time? And how do you do it? Let me tell you.

I am a .NET developer who is very passionate about enterprise search, primarily with Apache Solr. And this is great because I’ve spent the last few months of my life building the search API library for one of the Big Four auditing firms. It is a huge project, with hundreds of developers, and my library is only a very small piece in comparison to the rest, but it is a very important piece nonetheless.

And while working here, one of the things that I’ve noticed is how programmers have a tendency to give estimates that are not too accurate. Why does this seem to be a recurring issue?

Well, there are the endless meetings that they don’t account for in their estimates. There’s their own overconfidence — or ego estimates as I like to call them — and a tendency to assume features are simpler than they really are. Also, most people need a “warm up” period in the morning to get into the zone or “flow,” but then they have constant interruptions. If you want to learn more about getting in the productivity zone, I recommend Mihaly Csikszentmihalyi’s book Flow. It’s a classic and a must read.

Going back to estimates, let’s say for example that you estimate a task will take 40 hours, so just in case, you overestimate and say it will take 60 hours. This won’t work. Why? One good reason is that those extra 20 hours might be filled with another 10 hours of meetings and interruptions. Just think about it for a minute. How many times a day (or hour) does someone come to your desk and say “Hey, do you have a minute?” and, in some cases without even letting you answer, sit down or put a laptop on your desk and start explaining what they need?

So let’s go back to the original question. Are you still going to be able to be done in time?

The answer is usually no.

So what I observed is that some people couldn’t function properly in this environment and couldn’t deliver what they promised when they promised.

Not me.

I delivered what I promised.

On time.

With very few bugs (I can’t lie. I had bugs!).

How?

Was it because of my technical prowess?

Not really. I am a decent developer (and can proudly say a Pluralsight author, Syncfusion author, passionate about Search, speaker, presenter, and trainer among a few other things), but nothing out of the ordinary. There are some developers that I know who are way better than me technically, but I can beat them by a mile in terms of delivery.

Here’s the “secret sauce”:

I did three things:

  • Communicated properly
  • Organized myself
  • Protected my productive time fiercely but kindly

I was very aware of the first two, but not so much of the third. Now that I am fully conscious of its importance, I make it a priority in my day-to-day work.

When and how did it become very evident to me? Here is the exact moment.

So, there I was one day minding my own business, heads down in Visual Studio, when Brent (one of the testers) approached me and asked if I had time to review a bug and explain it. It was a good moment (or so I thought), so I said yes. He then looked to his side and, about three cubicles down, signaled a thumbs up to Radhika, another very nice tester, who smiled and, with a face like a kid on Christmas morning, said “yaaaaay”.

And then it hit me.

I had been protecting my time by managing all interruptions in a way in which I could focus on working and then schedule time for other people’s needs later. This might sound very simple, but it wasn’t. Everyone has their own deliverables for a specific date and time. And as much as you would like, their deliverables are their top priority, not yours.

And the better you are, the more time people take away from you. I have a friend whom peers call “Table of Contents” because he knows everything. How much time do you think he spends working on his deliverables vs. fixing issues on other people’s deliverables? The answer is simple. Most of his time is spent helping others, and he has to work until late at night to finish his stuff, while the developer he helped goes home at 5 pm.

To add insult to injury, top developers usually get more work assigned. It kind of makes sense from a management perspective. If you have a developer who is very productive, assign them more work and overall the project will benefit from higher productivity. The problem is that this is just short-term thinking. As you squeeze out every last possible drop of code from a top-notch developer, the end result will usually be a burned-out developer or a resignation letter. Some managers don’t even seem to care. Burned out? Get another contractor or employee, rinse, wash and repeat. But this is a discussion for another post.

Let’s get back to time. Does it sound familiar that others come to your desk asking for your time? Do you have to help them the minute they arrive because “it is urgent”? Or, “C’mon man, it is only 5 minutes”?

What most people forget or don’t know is that the brain requires concentration and focus, especially for activities like coding or design. Context switching breaks your process, which means that you need time to get back into your train of thought. I am not an expert, but I have heard that after any interruption you need at least 15 minutes to refocus and keep working. So a 1-minute interruption can in reality cost you 16 minutes that you will never get back. That’s why John Sonmez asks people not to message him while he is focused at work.
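To put rough numbers on that arithmetic, here is a minimal Python sketch. The 15-minute refocus time is the hearsay figure from the paragraph above, not a measurement, and the function name and the sample day of eight interruptions are made up purely for illustration.

```python
REFOCUS_MINUTES = 15  # assumed refocus time per interruption (the figure quoted above)


def minutes_lost(interruption_lengths):
    """Total minutes lost: each interruption plus the refocus time that follows it."""
    return sum(length + REFOCUS_MINUTES for length in interruption_lengths)


# A single 1-minute "do you have a minute?" really costs 1 + 15 minutes.
print(minutes_lost([1]))

# A hypothetical day with eight short interruptions (lengths in minutes).
print(minutes_lost([1, 5, 2, 1, 3, 1, 5, 2]))
```

Even when every interruption is short, the refocus time dominates the total, which is why batching requests into scheduled slots pays off.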

In my humble opinion, that is not a good deal for you.

Do you just put on your noise cancelling headphones and a big “Do Not Disturb” sign in front of you and code away?

It is not that simple either.

Unless you are a one man army product development unit, or your work is totally isolated from everyone else’s, then you are a necessary and useful component of your team’s development, and there is an intricate set of dependencies that cannot be ignored.

So what should you do? Let me tell you what I do, plus a few more tips that I have learned while out in the field.

  1. Turn off chat and rely on an asynchronous communication method. In layman’s terms, instant messaging applications like Skype, Skype for Business, Lync, Google Chat or similar will keep breaking your focus. I do agree that they can be tremendously beneficial, especially for distributed teams, but in most cases chat can be more of a distractor.
    But don’t uninstall it right away. You can use the “Busy” status so that you are not distracted while focusing on your work, and then during break times you can go back, check the messages and respond. Most applications also let you set a special message, so you could add a phrase like “Focused work time. Will reply soon” to let others know that you are not ignoring them. This is very useful, especially if you are working in an organization where prompt replies to IMs are expected.
  2. Take breaks. Did you know that your brain needs breaks? If you want to learn more about recommended work times and breaks to increase your productivity, I suggest you check out the Pomodoro technique.
  3. Manage your email instead of letting it manage you. This is the other big distractor: email. We all get bombarded by emails on a daily basis. A friend of mine, and the best PM that I have worked with, Ian, had a personal game where he kept a record of how many emails he received on a given day related to the activity that was taking place. Kind of like when you do trend analysis on Big Data, but at a small scale. “Today I received 259 project related emails and it is demo day of version 1.7!”

If you consider how much time it takes for a human being to read 259 emails and respond to them, then he would spend all of his day just reading and replying without having any time to do any of his work.

I assume some of you already did the math: in his specific case, he got an email almost every 2 minutes on average.
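For anyone who wants to check that math, here is a small Python sketch. It assumes an 8-hour workday and a guessed 2 minutes to read and answer each email; neither number comes from Ian, and the function names are only illustrative.

```python
def minutes_between_emails(emails_per_day, workday_hours=8):
    """Average gap in minutes between incoming emails over one workday."""
    return workday_hours * 60 / emails_per_day


def hours_spent_on_email(emails_per_day, minutes_per_email):
    """Total hours needed to read and answer every single email."""
    return emails_per_day * minutes_per_email / 60


# 259 emails spread over an assumed 8-hour (480-minute) day.
print(round(minutes_between_emails(259), 2))

# Time needed if each email takes a guessed 2 minutes to handle.
print(round(hours_spent_on_email(259, 2), 1))
```

Under those assumptions, handling every email would take longer than the workday itself, which is the point of the story.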

Yes, his was an extreme case. I’ve noticed that the higher you are in the “food chain,” the more people tend to add you to the thread. The developer can later claim “Ian was copied,” and that can get him out of trouble (or at least the developer thinks it will).

The same advice goes for email. While email is very important, your objective as a developer is to deliver working code within budget and on time. So please do not forget about email, but don’t make it priority #1 above your deliverables.

Remember, responding to emails can keep you from getting fired, but delivering top-notch code can get you promoted or a raise if you play your cards right.

I recommend you schedule email-dedicated times, and let your peers know so they have a better expectation of when you will be replying.

Okay. So at this point, we have covered two of the biggest distractors, IM and email. But what about the “hey, do you have a minute?” cases where someone sits next to you and expects you to jump head first into their need?

Don’t get me wrong. I do it from time to time. It is an impulse. You need something and someone else has an answer. The impulse grows the more desperate you are.

Here is an example from a few months ago. I had already spent about an hour (which felt like an eternity) trying to modify a dependency injection scenario where I couldn’t get the bloody thing to work, and I knew that both of my team members, Sandeep and Satish, who were sitting quietly in their cubicles across from me, could definitely help. First of all, I was trying to get it done by myself. Just asking for help at the tiniest issue is not good. As a developer, I get paid to deliver, which involves finding solutions to problems that I run into in the process.

I had it almost sorted out. There was something that I was missing. When I decided it was time to get help, I did interrupt Satish, but I did it in what I believe was a nicer way, without abusing his time.

I waited a bit for a time when I believed it was okay to ask for help, which was after his phone rang and he hung up. So I approached him and said “Hi Satish, I have a problem with IoC. Would you be able to help me when you get a chance?” I didn’t take my laptop with me either, because that would be applying pressure.

He said: “Yes I can, please give me 15 minutes.” About 20 minutes later, he told me he could help. And indeed there was something I missed, I corrected the error, continued development, and was able to check in the feature a bit later.

I did interrupt his work stream, but in such a way that he commanded his time. I asked for his help, but on his terms, whenever he felt it was a good time.

As I said, there are dependencies that can’t be ignored, but the trick is to be respectful of others’ time and expect the same.

IM, email, and one-on-one help can destroy your productivity, but they are more manageable than the next level of interruptions that can wreak havoc on your productive time: meetings.

Meetings are a double-edged sword. They are an absolute must in some cases, but a complete waste of time in others. And it may be difficult to know in which direction it is going to go. So let’s take it one step at a time with a real world scenario.

I participated in a large Agile project. In such a case, there are multiple teams that have different focused responsibilities. At a general level, you have the business people coming up with the specs, which then flow to the designers, who then pass it along to the architect team, development, testing, and deployment. The process is a bit more complex than this, and it is called Daikibo or Agile at scale, but you get the general idea.

In such an environment, it is of utmost importance that there is fluid communication and that every team down the stream understands fully what needs to be done.

Thus, we have meetings. Granted, specific issues can be followed up via email, work items or IM, but sometimes a meeting is required to, as the cliché says, “get everyone on the same page.”

I give kudos to a well-run and organized meeting that has a very clear objective, which is met, and only the right people are involved.

Does this happen often? No, not really.

Meetings are abused continuously to levels that make my Outlook cringe in pain! I’ve already blogged a few times, both on Pluralsight and on my personal blog, about how to improve meetings and what hurts your meetings.

Sometimes I’ve felt as if we’ve had meeting inception. In some of the projects I have participated in, it feels like we have meetings to plan for meetings! Eternal déjà vu. A glitch in the Matrix?

Increasing your meetings’ productivity is a topic that I intend to cover very soon in a different post, but for now, my general recommendation is this: hold meetings only when strictly required, invite only those who have something of value to add, and schedule them at times that do not interrupt people who need uninterrupted time to concentrate and work. Examples of such times are the beginning of the day, around lunch time and near the end of the day.

This is a concept known as core working hours. I saw a team down the hall put up a sign that explains this very clearly: “Core working hours are from 9 am to 11:30 am. Please do not interrupt, and expect a delay on our responses.” How great can this be? 2.5 hours of uninterrupted work time. So much productivity behind a closed door!

At first it might be a bit hard to have a part of the team not responding to your emails right away, providing instant replies to your messages or attending back to back meetings. But there are some very clear benefits to this model.

First, makers will concentrate on delivering. Don’t you hate it when a team member has a high-priority, urgent item that they don’t finish, and when you ask them why, they answer “because I’ve been in XYZ meetings and responding to my emails”? This drives me nuts!

But notice that I am not saying all developers do this. Only a few do, but you get a potential excuse out of the way. For the rest, they get time to concentrate and work.

Another benefit is that other team members, when they know that a person will not be available during a particular period of time, will get used to planning ahead and communicate using email or via comments in work items. It doesn’t matter that much which tracking system you use, but it does make a difference if you use one. In Simple Programmer, we use Trello, but I am a big fan of Jira as a work tracking system. So much so that I have two courses on Pluralsight on Agile, both with Scrum and Kanban.

A core working hours agreement will definitely allow team members to have time dedicated exclusively to working, and this is great. But you do have to take it one level further, and this is the hardest part: getting each person to be as productive as they can be.

This is no simple feat in any way that you look at it, and it goes beyond what can be done at a team level because it needs to happen at an individual level.

For a person to reach a peak and optimum performance level, they need to be in the right conditions — core working hours being one that I want to convey — and in the right mindset.

How can this be achieved? There are multiple books and methods that can be learned to achieve maximum productivity. For example, Sonmez does it using the Pomodoro technique, which he covered in the Soft Skills book and released recently in a course on his productivity secrets.

But this is the topic of discussion for another post, which I believe will be touched on at length in many more Simple Programmer posts, as well as the book and recently released course.

What I wanted to focus on was how to set the right conditions in a work environment that will allow you to be ready for the next level of maximum productivity.

Let’s do a quick recap of what I presented:

  •  Time can be your friend, or it can be your enemy.
  •  You can manage your time, or you can let others manage your time.
  •  To be able to create (i.e., write code in case of a developer) you need blocks of uninterrupted time.
  •  After each interruption, your brain needs some time to refocus and get in the “flow” to produce optimum results.
  •  IM is a constant interruption; try to minimize the use of it and set your status to “Busy” with a friendly message when working.
  •  Schedule email.
  •  Don’t interrupt people with the expectation they will help you right away; instead, ask politely for help on the schedule of the person that you are seeking help from.
  •  Meetings can be good, but they can be evil as well; thus, invite only the people that are absolutely required with a clearly defined agenda and meeting objective.
  •  A core working hours agreement can set the right conditions for levels of optimum performance to flourish.
  •  With the right conditions, going to the next level is up to you.

And there you go, that is my advice. Protect Your Productive Time!

 

Thanks to John Sonmez for the opportunity to write on his blog. If you want to supercharge your career, follow him!

This post originally appeared in: https://simpleprogrammer.com/protect-productive-time/

SolrNet Release 0.5.1: Fix Spellcheck Parser Issue

by Xavier

SolrNet, the C# client for Apache Solr, has a new release: 0.5.1. This release addresses a breaking change in the latest versions of Solr 4.x, in which multiple collations are returned by Solr. I am currently working on getting it to NuGet.

This is the release:

SolrNet Release 0.5.1: Fix Spellcheck Parser Issue

Let me show you quickly with an example. Here is how a single collation was returned before:
<response>
<result numFound="1" start="0">
<doc>
<str name="Key">224fbdc1-12df-4520-9fbe-dd91f916eba1</str>
</doc>
</result>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="hell">
<int name="numFound">1</int>
<int name="startOffset">0</int>
<int name="endOffset">4</int>
<arr name="suggestion">
<str>dell</str>
</arr>
</lst>
<lst name="ultrashar">
<int name="numFound">1</int>
<int name="startOffset">5</int>
<int name="endOffset">14</int>
<arr name="suggestion">
<str>ultrasharp</str>
</arr>
</lst>
<str name="collation">dell ultrasharp</str>
</lst>
</lst>
</response>

And then with later versions of Solr 4.x multiple collations were returned:

<response>
 <result name="response" numFound="0" start="0"></result>
 <lst name="spellcheck">
 <lst name="suggestions">
 <lst name="produtc">
 <int name="numFound">1</int>
 <int name="startOffset">0</int>
 <int name="endOffset">7</int>
 <arr name="suggestion">
 <str>product</str>
 </arr>
 </lst>
 <lst name="collation">
 <str name="collationQuery">product</str>
 <int name="hits">1000</int>
 <lst name="misspellingsAndCorrections">
 <str name="produtc">product</str>
 </lst>
 </lst>
 </lst>
 </lst>
</response>

The problem was that SolrNet would fail because the numFound node was not found. This issue is now fixed.
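To make the difference between the two response shapes concrete, here is a minimal Python sketch that reads both. It is not SolrNet’s C# code and does not reproduce the actual fix; the trimmed sample responses and the `parse_spellcheck` helper are assumptions made for illustration. The key idea is that an entry named "collation" under "suggestions" must be handled separately, because in the newer shape it is a `<lst>` with no numFound child.

```python
import xml.etree.ElementTree as ET

# Trimmed-down versions of the two responses shown above.
OLD_STYLE = """
<response>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="hell">
        <int name="numFound">1</int>
        <arr name="suggestion"><str>dell</str></arr>
      </lst>
      <str name="collation">dell ultrasharp</str>
    </lst>
  </lst>
</response>
"""

NEW_STYLE = """
<response>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="produtc">
        <int name="numFound">1</int>
        <arr name="suggestion"><str>product</str></arr>
      </lst>
      <lst name="collation">
        <str name="collationQuery">product</str>
        <int name="hits">1000</int>
      </lst>
    </lst>
  </lst>
</response>
"""


def parse_spellcheck(xml_text):
    """Return (suggestions, collations) from a Solr spellcheck response."""
    root = ET.fromstring(xml_text)
    suggestions = {}
    collations = []
    node = root.find("./lst[@name='spellcheck']/lst[@name='suggestions']")
    if node is None:
        return suggestions, collations
    for child in node:
        name = child.get("name")
        if name == "collation":
            if child.tag == "str":
                # Older shape: the collation is a plain string.
                collations.append(child.text)
            else:
                # Newer shape: the collation is a <lst> holding a collationQuery.
                query = child.find("./str[@name='collationQuery']")
                if query is not None:
                    collations.append(query.text)
            continue  # never treat a collation as a misspelled term
        # Anything else is a misspelled term with its suggested replacements.
        words = [s.text for s in child.findall("./arr[@name='suggestion']/str")]
        suggestions[name] = words
    return suggestions, collations


print(parse_spellcheck(OLD_STYLE))
print(parse_spellcheck(NEW_STYLE))
```

A parser that assumes every `<lst>` under "suggestions" is a misspelled term would look for numFound inside the collation entry and not find it, which matches the symptom described above.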

This is the first release that I have made of SolrNet since I was granted permission to provide new releases. I am getting up to speed so that I can work my way through the backlog of improvements and add support for newer releases of Solr.

If you have any questions, don’t hesitate to contact me via this blog or @xmorera on Twitter.

 

How to Install TeamViewer on CentOS (useful on Cloudera QuickStart VM)

by Xavier

The other day I needed to finish a task on one of my servers and needed to remote into one of my Cloudera QuickStart VMs to run a test while on a trip. So I installed TeamViewer to access it. The steps are simple:

Click on the Download TeamViewer link for RedHat, CentOS, Fedora, SUSE to get the rpm package from the downloads page:

https://www.teamviewer.com/en/download/linux/

Open a terminal, go to the downloads directory, and run:

sudo yum localinstall teamviewer_12.0.71510.i686.rpm

And then start with

teamviewer

The Art of Creating Applications That Have Search

by Xavier

In my Pluralsight trainings, Getting Started with Enterprise Search using Apache Solr and Implementing Search in .NET Applications, one of the things that I emphasize is how important search is, and yet it is one of the most misunderstood functions of IT and development in general. In this post I will show you an example of how a potentially good app ends up being a pretty bad app, mainly because of its search capabilities.

So much so that Pluralsight selected this phrase to tweet about the release of my course, as you can see here:

[Image: searchiseverywhere]

But now let’s get to the sample. Here’s the scenario:

Problem: Life is busy. No time to go to the supermarket.

Solution: Use your grocery store’s web site to purchase your food and have it delivered to your home the next day. It is a charming idea. It did not work for Webvan, but it seems to be doing quite well for Amazon, and in my home town one of the major supermarkets is doing it in a more controlled way with a good delivery service, all for $10. Not too scalable, but for an MVP it is ok. (Read Lean Startup if you don’t know what an MVP is.)

It may or may not work, mainly because of a really bad user experience, but let me get to the point. UX is important! Never forget it!

You get to the app at https://www.automercado.co.cr/aam/showMain.do and they have four main sections, as you can see here:

[Image: auto]

And here is what they are for:
– On the left they have a directory-style navigation organized by aisle. Grouping kind of works in my opinion if you are not too sure of what you want, but it is terribly slow and inefficient. They lose cookie points for this.

[Screenshot: 2014-07-02_0638]

– Then in the middle they have a section where they display the products. This is very standard, so it kind of passes; however, they lose cookie points again for having products without pictures or with very weird stretching. They are a supermarket, and a big one, so I am sure they can send a guy with an iPhone to take a quick picture.

[Screenshot: 2014-07-02_0637]

– The cart has a problem: it does not actually display the product name, only the description. Who thought of this? Not even something as simple as a tooltip!

[Screenshot: 2014-07-02_0640]

And then here is the deal breaker for me: BAD SEARCH! As mentioned earlier in the post, search is one of the most misunderstood functionalities in IT. A lot of people make huge mistakes because search can be done with a database. It can, but the end result sucks! And it did suck here.

Let me show you. I want to look for “jabon dial”, which means “Dial soap”. So I just type “Jabon Dial”. Should work, right? It doesn’t! Look at the message: “No results found…”. Also, I hate the CAPS. There may be one technical reason I can think of, but it is pretty dumb.

[Screenshot: 2014-07-02_0646]

But why? If you type only “Dial” and look closely, there are 27 types of “Jabon Dial”:

[Screenshot: 2014-07-02_0649]

The problem lies here:
– The person who implemented this application had no knowledge of how search works, which is normal, as search is pretty misunderstood.
– But humans don’t search the way engineers want them to. Making the user search exactly the way the engineer wants is just lazy and ineffective.
– So the engineers who created this probably went for a simple exact match in a database search.
– This is a terrible user experience. I can bet the farm that Amazon would have closed its doors in the 1990s if they had such a bad search.
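To illustrate the difference, here is a minimal Python sketch comparing a naive exact match with a slightly more forgiving word-based match. How the site is actually implemented is unknown; the product names, the `exact_match` and `token_match` functions, and the matching rules are all hypothetical.

```python
# Hypothetical catalog entries, stored in capitals like the site displays them.
PRODUCTS = [
    "JABON DIAL BARRA ANTIBACTERIAL",
    "JABON LIQUIDO DIAL",
    "DETERGENTE LIQUIDO",
]


def exact_match(query, products):
    """Naive search: the whole query must appear verbatim, case and all."""
    return [p for p in products if query in p]


def token_match(query, products):
    """Forgiving search: every query word must appear, in any order or case."""
    wanted = set(query.lower().split())
    return [p for p in products if wanted <= set(p.lower().split())]


# What a shopper naturally types.
print(exact_match("Jabon Dial", PRODUCTS))
print(token_match("Jabon Dial", PRODUCTS))
```

Even with the case fixed, the exact match would still miss “JABON LIQUIDO DIAL” because the words are not adjacent. A real search engine such as Solr goes much further than this sketch, with text analysis and relevance ranking.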

How to fix it? Well, go learn how to use a search engine. And that’s why I created my course, Getting Started With Enterprise Search Using Apache Solr: http://www.pluralsight.com/training/Courses/TableOfContents/enterprise-search-using-apache-solr

Tip of the Day: Disable Screen Blanking in Linux CentOS while Installing Cloudera CDH

by Xavier

Something that really annoys me, especially when connecting remotely, is how the terminal blanks while installing Cloudera Manager on Linux CentOS.

Well, there is a very simple fix. Simply run the following command and the terminal will not go black:

sudo setterm -blank 0

The Day We Started to Outgrow Relational Databases

by Xavier

Look around you. Look closer. Pay more attention. What do you see?

When I look around me I can see activity trackers, digital cameras, smart watches, interconnected devices, virtual reality gadgets, wearable technology, smart elevators, energy-saving light systems, intelligent traffic lights, smart cars that receive over-the-air updates and can gather data on your driving habits, intelligent houses, eco-friendly buildings and more.

All of these generate massive amounts of data. But let’s hold that thought for a minute.

Now this is just what’s happening around you. What do you have in your pocket or in your hand right now?

Most likely a smartphone. It is your portal to the digital world, and even though it has become second nature (pretty much everyone walks around with a phone in their hand nowadays), it is a relatively new phenomenon.

It is highly likely that you use your smartphone constantly to check Facebook, Twitter, Instagram or search the web using Google, among a few other applications. This generates humongous amounts of data.

Let’s throw out a few numbers just to put it in perspective. Facebook has 1.6 billion users (yes, that is with a B), and millions of them log in every day to upload millions of pictures, add comments, like posts and perform many other actions. Every action has an impact, but as the amount of data grows, it gets harder to determine what that impact needs to be.

Then we have Twitter, which may be a tad smaller but still has plenty of data by any definition. The beauty of Twitter is not just the human interactions, but what can be extracted from the data.

And Google… well… what can I say? Try indexing the internet and then we can talk about it. Just do it, invite me over and I will buy you a coffee while you tell me how it went.

How much data do you think is generated daily by all these applications? And besides the applications, remember how I mentioned many devices that also create mountains of data? This means that besides human-generated data, we also have machine-generated data.

A Big Data World

The world has changed in unimaginable ways. Together, human- and machine-generated data bring us into an information explosion era the likes of which the world has never seen. You are living through a digital revolution and you can consider yourself lucky for being part of it.

Tweets, posts, likes, pictures and stats are very nice. But that is just the tip of the iceberg of what’s to come.

There are many other applications that require analyzing those massive amounts of data to help reduce costs, detect fraud, and support many other use cases that help drive innovation for all mankind. All these scenarios that help increase profits, decrease costs, innovate or stop the bad guys are nice. But let’s take it up one notch.

There are some people trying to make a difference, like hospitals that are working to cure cancer by analyzing DNA records, comparing them and finding ways to save human lives. Imagine if one of those lives they saved was your son, your daughter, your wife or your parents.

The world has changed around us. We now live in a world of Big Data.

Getting Insights

But data by itself is just data and, as I mentioned, something needs to be done with this data to get insights. How can this be achieved?

Well, let’s first rewind a few years and think about how this was done before. A while back, if you had a “massive” amount of data, you went to your preferred vendor and wrote them a huge check for an equally massive machine. Then you wrote another very big check to your favorite database provider for software to run on this machine, and built an application that could consume the data, process it and give you the answers you needed. Also, you usually had to limit the amount of data, as you were constrained by the limits of your big box, so you had to throw data away.

Outgrowing Relational Databases

But there were times when a big box was not enough. For example, what if you wanted to index the entire internet? There was no box big enough for this.

Also, there were a lot more scalability constraints, like performance. If you had 1 terabyte and it took 1 hour to process, and then you added another terabyte, it would probably take almost twice the amount of time or even more.

This was a problem that needed a whole new approach. It all changed when Google published papers in the early 2000s explaining how they had solved this problem using GFS and MapReduce.

And then magic happened. Doug Cutting and Mike Cafarella, who were working on the Nutch search engine, read the papers. They had to solve the same problem, so they got inspired and created Hadoop!

Welcome to Distributed Computing at Its Best

Hadoop took a different approach to solving Big Data problems. Instead of a big box, it relied on creating clusters of many smaller and way cheaper computers, also known as commodity hardware.

The data is distributed among all these computers and then processed locally. In hindsight it sounds like common sense, but instead of taking the data to where it is processed, Hadoop takes the computation to the data. Each individual node in the cluster holds its own portion of the data and does all the computation locally.

Agreed, a big box is probably more reliable than a bunch of commodity servers, so a few of them might fail during processing. This is not a problem, as Hadoop has data redundancy: if one server fails, there is a copy of the data replicated on another server that can pick up the work.

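The idea of taking the computation to the data can be sketched in a few lines of plain Python. This is only a toy word count in the MapReduce style, not Hadoop code: the `NODES` list stands in for blocks of data stored on three machines, and both function names are made up for illustration.

```python
from collections import Counter

# Each inner list stands in for the block of data stored on one node.
NODES = [
    ["big data big deal"],
    ["big hype", "data drives the world"],
    ["who drives the data"],
]


def map_on_node(lines):
    """Runs where the data lives: count words in this node's lines only."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def reduce_counts(partials):
    """Merge the small per-node results into one final answer."""
    total = Counter()
    for partial in partials:
        total += partial
    return total


# On a real cluster each map step would run in parallel on its own machine.
partials = [map_on_node(lines) for lines in NODES]
totals = reduce_counts(partials)
print(totals.most_common(3))
```

Only the small per-node counts travel across the network, never the raw data, which is why adding more nodes lets the same job handle more data.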
And so we have distributed, resilient, dependable and efficient Big Data systems that are helping us change the world.

Data drives the modern world. But who drives the data? That is what Cloudera is here for.

Ask Bigger Questions!

Big Data… Big Deal? Big Hype?

by Xavier

Big Data

Big deal?

Big hype?

Or big change in our world?

I think that the answer can be all of the above. “Hype” you might be thinking? Well, here is the deal. Our world has changed in unimaginable ways. The amount of information created daily is reaching levels that just a few years ago would’ve been considered science fiction or even plain old crazy.

Lots and Lots of Data
To make it even more interesting, a lot of it is unstructured data, which can be kind of a problem if we think about it, because the success of relational databases has taught a lot of us to think in a columnar and relational way.

And this is not bad… at all. It is nice to have all your data and metadata organized neatly. You can use select, join, where, group by and more to get what you need.

But the success of relational databases can also create a blind spot for many. Just a few days ago I was talking with the VP and cofounder of a company that builds migration and artificial intelligence software and has had success (as well as a few failures or learning experiences) in several world-class projects. They have lots of data that they obtain from their automated code conversion tools, and what are they doing? They are normalizing it into a database.

I don’t think it is a bad approach, however it is not the one that I would take. Long story short, I would store the logs as is in their raw format and then use any of the available projects to analyse it in multiple ways, looking for key points, failures, trends and more. But what you do with the data is the topic of another post or a Pluralsight training. Let’s go back to our main point.

Mountains of data are being generated daily, and the amount will just continue to explode.

Unstructured Data “Just Happens” 
If you had to structure all your data, can you imagine what the cost would be? Just go ask your manager for an Oracle system and some servers to process all of your web server logs and put them in tables. The cost would be exorbitant.

And besides cost, sometimes you may not know the structure of your data. And that is one of the beautiful parts of Big Data. You can just store your logs in raw format and later come back and do your work, modelling your data in different ways. And what if you have too much data and the process is taking longer than expected?

Well, just add a few more servers and get the job done in parallel. Hadoop runs on commodity hardware, so you can get many relatively inexpensive machines to work together and process your data according to your needs.
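Here is a rough sketch of both ideas in plain Python: keep the log lines raw, apply structure only when you read them, and split the work into chunks that are processed side by side. This is not Hadoop code. The sample lines, the "date time LEVEL message" format and the function names are invented for illustration, and a thread pool on one machine stands in for a cluster of servers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Invented sample lines in a hypothetical "date time LEVEL message" format.
RAW_LOG = [
    "2016-08-01 10:00:01 INFO converted module Billing",
    "2016-08-01 10:00:05 ERROR unsupported statement in module Payroll",
    "2016-08-01 10:01:12 WARN deprecated call in module Billing",
    "2016-08-02 09:15:40 ERROR unsupported statement in module Billing",
]


def map_chunk(lines):
    """Map step: parse one chunk of raw lines and count entries per level."""
    counts = Counter()
    for line in lines:
        _date, _time, level, _message = line.split(" ", 3)
        counts[level] += 1
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-chunk counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_levels(lines, workers=2):
    """Split the lines into `workers` chunks, map them concurrently, then reduce."""
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)


LEVEL_COUNTS = count_levels(RAW_LOG)
```

The result is the same whether you use one worker or several; only the amount of work done at the same time changes. And because the raw lines are never altered, you can come back later with a different map step, for example counting errors per day, without reloading anything.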

The Cloud and the Bar
And even better, remember “the cloud”! A few years ago, if you were a startup and needed beefy power, you would need a lot of upfront cash to cover expenses. Now with AWS and Azure we can turn on a few virtual machines, get a cluster up and running, crunch the data, get the results, turn the machines off and only pay for the time we use.

And this change has lowered the entry bar for innovation. Now many brilliant ideas can be tested and theories can be analysed at a much lower cost, benefiting all mankind. For example, it is possible to run analysis on medical treatments to help cure cancer or many other diseases. Sometimes answers to hard questions lie right there in the data; they just need to be discovered.

Hype or Go Figure This Hadoop Thing Out
But what about hype? Let me make this clear: I don’t think Big Data is hype. I do think that there is a lot of hype around it. Even though we are able to do great things with Big Data, the general public does not yet fully understand what can be done and how, so I have taken on a personal mission to help developers and the public in general understand Big Data (and Search).

So then it is time to ask ourselves this question:

What Are My Choices for Getting Started with Big Data?

Big Data Links

by Xavier Comments: 0

Here is a collection of some interesting or fun articles that I have found on Big Data:

– 5 Reasons to Move to Big Data (and 1 Reason Why It Won’t Be Easy): gives an easy-to-understand set of selling points for adopting Big Data, while making clear some of the issues you might face.

– The Most Practical Big Data Use Cases Of 2016: covers some interesting use cases of Big Data. Remember, Big Data is sexy!

– Why ‘Big Data’ Means Nothing Without ‘Little Data’: Little Data is regular performance metrics.

– Why Big Data is the new competitive advantage: provides good points on how Big Data can help give you an edge.

– Big Data: What is it and Why it Matters: goes straight to the point to explain the basics.

– Big Data Analytics: What is it and Why it Matters: explains what Big Data Analytics is.

SolrNet vs. SolrExpress

by Xavier Comments: 1

I have been working with Solr for a while, mainly from the .NET world, and I basically love it. I use SolrNet, which I think is a very mature and stable library. I was asked today if I have ever used SolrExpress and if I recommend it over SolrNet.

The short answer is no, I have not used it. Therefore I can’t give a fact-based recommendation, but having looked over the source code of both libraries, it is my opinion that SolrNet is still more complete. So I still believe SolrNet to be the more sensible choice.

It is worth mentioning that this is a biased point of view, as I have used SolrNet multiple times and it really has made my life a lot easier.

Having said that, besides using SolrNet extensively, I have authored a few things around Solr and SolrNet. It works fine and I know it pretty well. It basically gets the job done, and it is pretty mature and almost complete (pending SolrCloud and a few minor things like a breaking change on collation).

Some of the things I created

I created a Solr training for Pluralsight

https://www.pluralsight.com/courses/enterprise-search-using-apache-solr

Getting Started with Enterprise Search Using Apache Solr …
www.pluralsight.com
Search is one of the most misunderstood functionalities in the IT industry. Apache Solr brings high quality Enterprise Search to the masses.

And a SolrNet training for Pluralsight

https://www.pluralsight.com/courses/implementing-search-dotnet-applications

I wrote a book on Solr and SolrNet for a company called SyncFusion, as part of their Succinctly Series

https://www.syncfusion.com/resources/techportal/ebooks/apachesolr

I’ve also done internal trainings, presentations and webcasts on Solr + SolrNet
http://www.meetup.com/Atlanta-Net-User-Group/events/222161640/

Learn How to Add Search to .NET with Solr & SolrNet …
www.meetup.com
Search is a functionality that most people take for granted while at the same time it is deeply misunderstood and usually poorly implemented. .NET

SolrNet does not yet have support for SolrCloud in the main repository. There is one fork that already supports it, but our current project does not use forks, only the main repository. If using a fork is not a blocker for your customer, go ahead. Otherwise, as in our case, just use a load balancer for querying and a call to the ZooKeeper API to get the leader for indexing.
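As a rough illustration of the "get the leader" step, here is a small Python sketch (our project is .NET, but the idea is the same). It assumes you have already read the collection's cluster state JSON from ZooKeeper; that read is left out, and the JSON is pasted in as a string. The layout is modelled on the state SolrCloud keeps per collection, with the leader replica carrying `"leader": "true"`, but treat the field names as an assumption to check against your Solr version. The `products` collection, hosts and core names are made up.

```python
import json

# Hypothetical cluster state for a made-up "products" collection.
CLUSTER_STATE = """
{
  "products": {
    "shards": {
      "shard1": {
        "replicas": {
          "core_node1": {"core": "products_shard1_replica1",
                         "base_url": "http://10.0.0.1:8983/solr",
                         "state": "active", "leader": "true"},
          "core_node2": {"core": "products_shard1_replica2",
                         "base_url": "http://10.0.0.2:8983/solr",
                         "state": "active"}
        }
      },
      "shard2": {
        "replicas": {
          "core_node3": {"core": "products_shard2_replica1",
                         "base_url": "http://10.0.0.2:8983/solr",
                         "state": "active", "leader": "true"},
          "core_node4": {"core": "products_shard2_replica2",
                         "base_url": "http://10.0.0.1:8983/solr",
                         "state": "down"}
        }
      }
    }
  }
}
"""


def find_leaders(state_json, collection):
    """Return {shard name: URL of its active leader core} for one collection."""
    state = json.loads(state_json)
    leaders = {}
    for shard_name, shard in state[collection]["shards"].items():
        for replica in shard["replicas"].values():
            # The leader flag is assumed to be the string "true", not a boolean.
            if replica.get("leader") == "true" and replica.get("state") == "active":
                leaders[shard_name] = f'{replica["base_url"]}/{replica["core"]}'
    return leaders


LEADERS = find_leaders(CLUSTER_STATE, "products")
```

Indexing requests would then go to the URLs in `LEADERS`, while queries keep going through the load balancer. Leaders can change, so in practice you would refresh this lookup rather than cache it forever.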

Hope this helps.

Incremental Search and How Every Day You Learn Something New

by Xavier Comments: 0

Yesterday I was coming back from the beautiful mountains of Monteverde in Costa Rica, feeling full of energy after a relaxing weekend. Monteverde is one of the most beautiful places I’ve been. Newsweek has declared Monteverde the world’s #14 “Place to Remember Before it Disappears.”

Anyway, on the drive back I stopped and decided to check my email, as usual, and I saw a contact form from my blog, so I decided to check it. This is what I found, a note from Robert Stevens: