Music for reading (Spotify)

Perhaps I’m coo-coo, banana pants crazy, but I’m perfectly willing to spend an hour automating a repetitive 10-minute task. At my previous job (Seimitsu), we called this being “creatively lazy”. “Randal,” I hear you say, “isn’t that a waste of time?” Well, that depends on your answer to this question: “Will anyone ever have to perform that 10-minute task again?” If the answer is “yes” then it’s not a waste of time in the slightest to the poor soul asked to spend those 10 minutes God knows how many times until they quit, retire, or die.

Let’s take a recent example from my current job (the names have been changed to protect the blah, blah, blah… Please don’t fire me!). For marketing purposes, we needed to know the names, addresses, and phone numbers of all of the medical practices, physicians, physician assistants, and nurse practitioners in and around given areas who might refer to our office. So did we condemn some unfortunate person to the drudgery of googling (or searching through phone books!) and manually compiling this information? By the Power of Grayskull, no! Off I went to solve this little problem with code.

The problem is simple enough: efficiently search for the required data and gather the results into some appropriate digital format. The format, in this specific case, is CSV because it will ultimately be used in marketing materials. The ability to sort, tabulate, and edit it in an application like Excel is preferred.

The solution hinges on two crucial bits of information. The first is that all the practices and practitioners in the US have to register for an NPI number with the federal Department of Health and Human Services. The second is that this registry of NPI data is available online with a pretty well-documented API. Understanding the problem and knowing these two crucial bits, work began. If you’d like to take a gander at the source code, you eager beaver, you can check it out on my GitHub.

Step 1: Connect to the API

Making HTTP connections in Ruby 2.x is made almost trivial by the Net::HTTP library. It’s really a whole topic unto itself but, simply put, Net::HTTP allows you to connect to resources (like APIs) via the HTTP and HTTPS protocols. You can include it and all its tasty goodness in your project by requiring it at the top of your code.

require 'net/http'

Making the connection happens in two parts: creating the URI (i.e. resource’s URL) to which you need to connect and then initiating the connection.

def build_uri
  URI(
    "https://npiregistry.cms.hhs.gov/api/?"\
    "number=#{npi_number}"\
    "&enumeration_type=#{enumeration_type}"\
    "&taxonomy_description=#{taxonomy_description}"\
    "&first_name=#{first_name}"\
    "&last_name=#{last_name}"\
    "&organization_name=#{organization_name}"\
    "&address_purpose=#{address_purpose}"\
    "&city=#{city}"\
    "&state=#{state}"\
    "&postal_code=#{postal_code}"\
    "&country_code=#{country_code}"\
    "&limit=#{limit}"\
    "&skip=#{skip}"
  )
end

I encapsulated the creation of the URI into a method. For my purposes, the URI method (which comes from Ruby’s uri library, loaded for you by net/http) is simply being passed the NPI Registry’s API URL as a string. I broke the string up for better readability.
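One caveat with plain string interpolation is that values containing spaces or ampersands (a city like "New York", say) are not escaped. Here is a minimal sketch of one way around that, assuming a hypothetical params hash in place of the project's attr_accessors and using URI.encode_www_form from the standard library. It is an alternative to consider, not what the project's code does.

```ruby
require 'uri'

# Hypothetical search criteria; the project itself reads these from attr_accessors.
params = {
  city: "New York",
  state: "NY",
  postal_code: "10001",
  limit: 10
}

# encode_www_form escapes each key and value (a space becomes "+").
query_string = URI.encode_www_form(params)

uri = URI("https://npiregistry.cms.hhs.gov/api/?#{query_string}")

puts uri.query
```

The trade-off is that criteria left unset would need to be filtered out or defaulted to empty strings first, which the interpolated version handles implicitly.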

You’ll likely notice the “number=#{npi_number}” lines in the URL. I defined each of the API’s available search criteria as an attr_accessor of my class (“NpiDownload”). Ruby’s attr_accessors are a deeper topic for a future post but defining the criteria this way meant any of the criteria could be set on an instance of the class by a simple assignment like the following lines.

npi_download = NpiDownload.new
npi_download.postal_code = "11111"

All that said, the build_uri method would return a URI object built from the following string.

https://npiregistry.cms.hhs.gov/api/?number=&enumeration_type=&
taxonomy_description=&first_name=&last_name=&organization_name=&
address_purpose=&city=&state=&postal_code=11111&country_code=&limit=&skip=
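To see how the accessors and the interpolated string fit together, here is a pared-down sketch. It assumes the criteria are set on an instance and keeps only three of them; the real NpiDownload class on GitHub may be organized differently.

```ruby
require 'uri'

# Pared-down stand-in for the NpiDownload class, for illustration only.
class NpiDownload
  # attr_accessor defines a reader and a writer method for each name.
  attr_accessor :city, :state, :postal_code

  def build_uri
    # Unset criteria are nil, which interpolates as an empty string.
    URI(
      "https://npiregistry.cms.hhs.gov/api/?"\
      "city=#{city}"\
      "&state=#{state}"\
      "&postal_code=#{postal_code}"
    )
  end
end

download = NpiDownload.new
download.postal_code = "11111"

puts download.build_uri
```

Because an unset accessor returns nil, the empty "city=" and "state=" parameters fall out for free, which matches the shape of the full string above.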

Now that we had our URI object, it could be used to make the connection and deal with whatever the API returned.

Step 2: Submit the Query

Having worked with a few APIs, I knew that the results this one returned would be in JSON format (and the API documentation said as much). This means that, while the response would be a string, it would be organized into keys and values. So I pulled out Ruby’s JSON library to handle this heavy lifting. Similarly to Net::HTTP, you can get the JSON library’s deliciousness with the following line at the top of your code.

require 'json'

With the URI built to the API’s spec and prepping for the JSON business, I could then submit the call to the API with the following method.

def query
  JSON.parse(Net::HTTP.get(self.build_uri))
end

By your silence, I can tell that reveal didn’t quite impress. Believe me when I say that single-line method is actually doing quite a lot.

Starting from the innermost parentheses and working outward, the query method is calling “self.build_uri”. From Step 1 above, “build_uri” is the method I defined to “build” the URI object for the HTTP connection to the API.

“What’s with the ‘self’ bit?” I hear you ask (or is that the voices again?). “self” is a whole topic in Object-Oriented Programming but, without getting metaphysical, “self” basically refers to the object on which the current method was called. So no matter how many instances of the “NpiDownload” object I create, query is always calling build_uri on the same instance on which query was called. (Ruby would call build_uri on that same instance even without the explicit “self.”, but spelling it out makes the point obvious.) Cool? Cool.
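If that still feels abstract, here is a tiny sketch with a made-up MiniDownload class (not part of the project) showing that each instance's method call works with that instance's own data.

```ruby
# Hypothetical class used only to illustrate `self`.
class MiniDownload
  attr_accessor :postal_code

  def build_path
    "/api/?postal_code=#{postal_code}"
  end

  def describe
    # `self` is the instance that received `describe`, so build_path
    # runs against that same instance's postal_code.
    "Would request #{self.build_path}"
  end
end

first = MiniDownload.new
first.postal_code = "11111"

second = MiniDownload.new
second.postal_code = "22222"

puts first.describe
puts second.describe
```

Each call to describe reports its own instance's postal code, which is the same relationship query has with build_uri.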

The next parenthetical statement is Net::HTTP.get. This is taking the URI object created and returned by the self.build_uri call and issuing a GET request to the API’s URL. This is pretty straightforward, really. Think of this step like copying the API’s URL into the address bar of your browser (that’s what the .get method call is doing).
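One thing Net::HTTP.get won't tell you is whether the request succeeded, since it hands back the body whatever the status code. Here is a hedged sketch of a more defensive variant using Net::HTTP.get_response. The fetch_body helper is a hypothetical name and not part of the project.

```ruby
require 'net/http'

# Hypothetical helper: returns the response body for a 2xx status and
# raises otherwise, so an error page never reaches JSON.parse.
def fetch_body(uri)
  response = Net::HTTP.get_response(uri)

  unless response.is_a?(Net::HTTPSuccess)
    raise "Request to #{uri.host} failed: #{response.code} #{response.message}"
  end

  response.body
end
```

With something like this in place, the query method could call JSON.parse(fetch_body(build_uri)) instead, at the cost of a few extra lines.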

Finally, the JSON.parse part is taking the JSON string returned by the API and converting it into a nested hash.

Hashes, and collections in general, are a meaty topic in Ruby. If you’re not familiar, a hash is really just a collection of data organized into keys and values. “Hey,” you say, “that’s how JSON is organized!” I see you’ve been paying attention. With the URI built and the connection made, I just needed to get the results into a format we could use.
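To make the “nested hash” idea concrete, here is a small sketch that parses a made-up payload. The field names and values are assumptions for illustration; a real response from the API carries many more fields and may be shaped differently.

```ruby
require 'json'

# Made-up payload for illustration only; not a real API response.
sample = <<~PAYLOAD
  {
    "result_count": 1,
    "results": [
      {
        "number": "1234567890",
        "addresses": [
          { "city": "SPRINGFIELD", "postal_code": "11111" }
        ]
      }
    ]
  }
PAYLOAD

data = JSON.parse(sample)

# Keys come back as strings by default; JSON objects become hashes
# and JSON arrays become Ruby arrays.
first_result = data["results"].first

puts data["result_count"]
puts first_result["number"]
puts first_result["addresses"].first["city"]
```

Digging into the structure is just a chain of key lookups and array accesses, which is what makes the parsed hash easy to flatten into rows later.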

To be Continued…

That’s enough jabber from me for one post. In the next post, I’ll talk about the third step in this little project and give a postmortem on the project as a whole. The source code can be found on my GitHub, if you can’t take the suspense. And, of course, leave any questions or comments below or feel free to hit me up on the Twitter.

Until next time, take care!