The results of the survey will be published in a series of upcoming e-mail newsletters. To be sure you receive the results of the survey, if you haven’t already signed up for my free e-mail newsletter, please click here to subscribe.
Complete the 2019 Utility Fee Survey
If you would like to participate, please click here to complete the survey. It should take less than five minutes to complete.
A big part of my work, in
addition to sales presentations and consulting, is data analysis
and extraction. Data extraction often involves deciphering older, flat file
databases. To accomplish this, I rely on several software tools to make my job
easier. In today’s issue, I’ll share some of those tools with you.
If you’re familiar with
Notepad, the default Windows text editor, it can be frustrating to work with if
you’re trying to manipulate or better understand text files. Looking at text
files with long records, such as a meter reading interface file or outsource billing
export file, can be frustrating with Notepad because it uses
word wrap. With word wrap, trying to determine where one line ends and the next
line starts can be difficult.
My personal favorite text
editor is TextPad.
TextPad offers several features not available in Notepad. In addition to not
word wrapping, it also displays the line and column positions of where the
cursor is located, which comes in very handy when trying to determine if data
is in the correct position. The screenshot below shows the cursor is on line
65, column 648.
TextPad can also open files
in binary mode, which shows each byte of data as the two-digit hexadecimal value.
This is most valuable when trying to find binary or packed numeric data in a
flat file. In the illustration below, 129906 is the current meter reading, but
because it is stored as binary data, it’s not visible in the margin like the
meter number (02779619) is.
Another TextPad feature I
use frequently is Find in Files. This allows me to search all files, or a
wildcard representation, for a particular value. I often use this when first
trying to determine which files are important for a data conversion. If I know
an account number, I can search all the data files to see which of those file
contain the account number.
me start by saying Excel is a terrible tool to use to exchange data. Excel
notoriously tries to convert everything it can to numeric data. This means
leading zeros get dropped from fields where they are important (like the meter
number in the illustration above). Also, long numeric fields often get
converted to scientific notation.
that doesn’t mean Excel isn’t a valuable tool for data analysis, just don’t use
it to exchange files!
of the most obvious uses for Excel is to open delimited (tab, CSV, pipe) files.
Excel arranges the data in columns, making it much easier to review than
opening a delimited file in a text editor. Once data is in columns, it can
easily be sorted and filtered.
Filtering data is another feature I use frequently.
If I’m working with a file that has a column of codes and I want to know how
many unique codes are represented, Excel’s filter function is what I use. In
the screenshot below, I have a column called Account Class. To determine what
codes are represented in the Account Class column, all I had to do is turn on
filtering and click the dropdown arrow for the Account Class column and it
shows me the three unique values of C, M, and R:
One last tool I use less frequently than the others, but still rely on, is CSVed. As the name implies, this is a CSV, or any delimited file, editor. Excel has a limit to how large a file it can open which can be frustrating with very large files. CSVed doesn’t have that limitation and it will also allow me to filter records to save a subset of a large file.
For example, If I’m trying to analyze a history file that has billing, payment, and adjustment records, CSVed provides the functionality to export just one of the record types, for example, adjustments. This provides a much more manageable file to analyze and open in Excel.
Need assistance extracting old data?
If you’re trying to extract
data from an old database and need assistance, please give me a call at
919-232-2320 or e-mail me at firstname.lastname@example.org to learn how I could assist you.
I am the Senior Consultant with Edmunds GovTech | Logics in Raleigh, North Carolina. I have over 35 years experience developing and implementing utility billing and financial software and consulting with utilities and municipalities. My bi-weekly email newsletter draws from my experience in working with over 200 utilities and local governments to offer insight into how utilities can improve operations and better serve their customers. If you have a comment or a suggestion for a future email, please contact me by calling 919-673-4050 or sending an email to email@example.com