
Guide Command Line Data Manipulation Cli Miller

  • Alvin Bryan
  • December 29, 2022
  •   5 min read

Quick summary ↬  No more random scripts in Python and JavaScript to transform CSV or JSON data. In this article, Alvin Bryan shows you how to use Miller, a small and powerful CLI tool, to do all your data processing.

Allow me to preface this article by saying that I’m not a terminal person. I don’t use Vim. I find sed, grep, and awk convoluted and counter-intuitive. I prefer seeing my files in a nice UI. Despite all that, I got into the habit of reaching for command-line interfaces (CLIs) when I had small, dedicated tasks to complete. Why? I’ll explain all of that below. In this article, you’ll also learn how to use a CLI tool named Miller to manipulate data from CSV, TSV and/or JSON files.

Why Use The Command Line?

Everything that I’m showing here can be done with regular code. You can load the file, parse the CSV data, and then transform it using regular JavaScript, Python, or any other language. But there are a few reasons why I reach for command-line interfaces (CLIs) whenever I need to transform data:

  • Easier to read. It is faster (for me) to write a script in JavaScript or Python for my usual data processing, but a script can be confusing to come back to. In my experience, command-line manipulations are harder to write initially but easier to read afterward.

  • Easier to reproduce. Thanks to package managers like Homebrew, CLIs are much easier to install than they used to be. No need to figure out the correct version of Node.js or Python; the package manager takes care of that for you.

  • Ages well. Compared to modern programming languages, CLIs are old. They change a lot more slowly than languages and frameworks.

What Is Miller?

The main reason I love Miller is that it’s a standalone tool. There are many great tools for data manipulation, but every other tool I found was part of a specific ecosystem. The tools written in Python required knowing how to use pip and virtual environments; for those written in Rust, it was cargo, and so on.

On top of that, it’s fast. The data files are streamed, not held in memory, which means that you can perform operations on large files without freezing your computer.
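
To make the streaming idea concrete, here is a minimal Python sketch of the same principle: records are read and processed one at a time, so memory use stays flat no matter how large the file is. This is not how Miller itself is implemented; the function names and the sample data are hypothetical and only serve to illustrate the difference from loading a whole file first.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, TextIO


def stream_rows(source: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one CSV record at a time instead of loading the whole file."""
    yield from csv.DictReader(source)


def total_count(rows: Iterable[Dict[str, str]], field: str) -> int:
    """Sum a numeric column while holding only the current row in memory."""
    total = 0
    for row in rows:
        total += int(row[field])
    return total


# Hypothetical sample data; a real run would pass an open file handle instead.
sample = io.StringIO("title,count\napples,12000\noranges,17500\n")
result = total_count(stream_rows(sample), "count")
print(result)
```

Because `stream_rows` is a generator, swapping the in-memory sample for `open("some-large-file.csv", newline="")` would not change how much memory the loop needs.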

As a bonus, Miller is actively maintained: John Kerl really keeps on top of PRs and issues. As a developer, I always get a satisfying feeling when I see a neat and maintained open-source project with great documentation.

# some code
echo "Hello World"

The JSON looks like:

    [
      {
        "title": "apples",
        "count": [12000, 20000],
        "description": {"text": "...", "sensitive": false}
      },
      {
        "title": "oranges",
        "count": [17500, null],
        "description": {"text": "...", "sensitive": false}
      }
    ]
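
Records like these are nested, while CSV is flat, so moving between the two formats means flattening each record first. The following is a minimal sketch of one way to do that in Python. The dotted key names and the 1-based array positions are assumptions of this sketch, chosen for illustration, and the `flatten` helper is hypothetical rather than part of any tool.

```python
import json


def flatten(value, prefix="", sep="."):
    """Flatten nested dicts and lists into a single-level dict."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # 1-based positions are an illustrative choice, not a standard.
        items = ((str(i), v) for i, v in enumerate(value, start=1))
    else:
        # Scalars (including None) end the recursion.
        return {prefix: value}

    flat = {}
    for key, child in items:
        name = f"{prefix}{sep}{key}" if prefix else key
        flat.update(flatten(child, name, sep))
    return flat


records = json.loads("""
[
  {"title": "apples", "count": [12000, 20000],
   "description": {"text": "...", "sensitive": false}},
  {"title": "oranges", "count": [17500, null],
   "description": {"text": "...", "sensitive": false}}
]
""")

flat_rows = [flatten(record) for record in records]
for row in flat_rows:
    print(row)
```

Each flattened row has the same five keys (`title`, `count.1`, `count.2`, `description.text`, `description.sensitive`), which is what a CSV header needs. Note that empty objects and arrays produce no keys at all in this sketch.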

Code block with Hugo’s internal highlight shortcode

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>Example HTML5 Document</title>
  </head>
  <body>
    <p>Test</p>
  </body>
</html>

Compatibility

Language              Hugo Code  HTML lang Attribute  Theme Docs  Lunr.js Support
English               en         en
Simplified Chinese    zh-cn      zh-CN                Yes         Yes
Traditional Chinese   zh-tw      zh-TW                No          Yes
French                fr         fr                   No          Yes
Polish                pl         pl                   No          No
Brazilian Portuguese  pt-br      pt-BR                No          Yes
Italian               it         it                   No          Yes
Spanish               es         es                   No          Yes
German                de         de                   No          Yes
Serbian               sr         sr                   No          No
Russian               ru         ru                   No          Yes
Romanian              ro         ro                   No          Yes
Vietnamese            vi         vi                   No          Yes
Arabic                ar         ar                   No          Yes
Catalan               ca         ca                   No          No
Thai                  th         th                   No          Yes
Telugu                te         te                   No          No
Indonesian            id         id                   No          No
Turkish               tr         tr                   No          Yes
Korean                ko         ko                   No          No
Hindi                 hi         hi                   No          No

About The Author

Alvin Bryan is a Developer Advocate and an avid online learner. He is currently working at Contentful, previously at the Wall Street Journal.

Unhealthy love with dark corners of C++

Founded by an Introvert who writes to help others without showing his face. 2021–2023.

Fluentprogrammer.com is a brand name managed by Abyan, Inc.
