MATLAB vs. Python NumPy for Academics Transitioning into Data Science

This technical article was written for The Data Incubator by Dan Taylor, a Fellow of our 2017 Spring cohort in Washington, DC. 

This article was originally published on October 25, 2017, on The Data Incubator.

 

For many of us with roots in academic research, MATLAB was our first introduction to data analysis. However, due to its high cost, MATLAB is not very common beyond the academy: most companies find a license too expensive to justify. Luckily, for experienced MATLAB users, the transition to free and open source tools, such as Python’s NumPy, is fairly straightforward.

MATLAB has several benefits when it comes to data analysis. Perhaps most important is its low barrier to entry for users with little programming experience. MathWorks has put a great deal of effort into making MATLAB’s user interface both expansive and intuitive, so new users can import, model, and visualize structured data without typing a single line of code. Because of this, MATLAB is a great entry point into programmatic analysis for scientists. Of course, the true power of MATLAB can only be unleashed through more deliberate and verbose programming, but users can gradually move into this more complicated space as they become more comfortable with programming. MATLAB’s other strengths include its deep library of functions and extensive documentation, a virtual “instruction manual” full of detailed explanations and examples.

MATLAB’s main drawbacks, when it comes to analysis, stem from its proprietary nature. The source code is hidden from the user, and programs written in MATLAB can be used only by MATLAB license holders. With this in mind, it is important for academics transitioning into professional data science to broaden their skill set to include free and open source toolkits.

Python is often a data scientist’s first choice for data analysis. It is an open source, general-purpose programming language with countless libraries that aid in data analysis and manipulation. Because Python does not ship with a MATLAB-style graphical environment, data scientists typically rely on a third-party interface, such as IPython (used in the examples below) or Jupyter. Such interfaces allow nearly all of MATLAB’s functionality to be reproduced in Python.

All MATLAB users should become well-acquainted with NumPy, an essential Python library. NumPy provides the basic “array” data structure, which forms the backbone of multidimensional matrices and high-level data science packages, including pandas and scikit-learn.
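As a minimal sketch of that array structure (the variable names here are illustrative and not part of the original article), the following builds a two-dimensional array and inspects the properties that higher-level packages build on:

```python
import numpy as np

# A 2-D array (2 rows, 3 columns) built from nested lists.
m = np.array([[1, 2, 3],
              [4, 5, 6]])

shape = m.shape            # (rows, columns)
ndim = m.ndim              # number of dimensions
dtype_kind = m.dtype.kind  # 'i' means a signed integer element type

# Arithmetic with a scalar applies to every element at once.
doubled = m * 2

# Reductions can run along an axis: axis=0 sums down each column.
col_sums = m.sum(axis=0)

print(shape, ndim, dtype_kind)
print(doubled.tolist())
print(col_sums.tolist())
```

Every element of an array shares one data type, which is part of what makes these whole-array operations fast.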

Proficient MATLAB users should find NumPy quite intuitive, as the NumPy array behaves much like MATLAB’s ordinary numeric array (matrix) rather than its heterogeneous cell array. The biggest challenge is likely to be learning the syntactic differences between the languages. Here are some examples of equivalent code in both languages:

Python example code:

In [1]: import numpy as np

In [2]: a = np.array([1,2,3,4]); b = np.array([5,6,7,8])

In [3]: a[0]

Out[3]: 1

In [4]: a[1:3]

Out[4]: array([2, 3])

In [5]: a * b

Out[5]: array([ 5, 12, 21, 32])

In [6]: a / b  # integer division under Python 2; Python 3 returns floats here

Out[6]: array([0, 0, 0, 0])

In [7]: a * 1.0 / b

Out[7]: array([ 0.2       ,  0.33333333,  0.42857143,  0.5       ])

 

MATLAB example code:

>> a = [1 2 3 4]; b = [5 6 7 8];

>> a(1)

ans = 1

>> a(2:3)

ans = 2     3

>> a .* b

ans =     5    12    21    32

>> a ./ b

ans =    0.2000    0.3333    0.4286    0.5000

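Both languages support vectorization and easy element-by-element operations. Care needs to be taken with MATLAB, where the default operators (* and /) perform matrix operations and the element-wise versions are written .* and ./, whereas NumPy’s * and / are element-wise by default.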
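The contrast between element-wise and matrix operations can be sketched on the NumPy side as follows. This assumes Python 3.5 or later, where the @ operator is available; the 2-by-2 matrices A and B are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# In NumPy, * is element-wise (what MATLAB writes as .*).
elementwise = a * b

# @ is the matrix product; for two 1-D arrays it is the inner product.
inner = a @ b

# With 2-D arrays the two operators give different results.
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
hadamard = A * B   # element-wise product (MATLAB: A .* B)
matprod = A @ B    # matrix product (MATLAB: A * B)

print(elementwise.tolist(), inner)
print(hadamard.tolist())
print(matprod.tolist())
```

In other words, the defaults are reversed: the plain operator is the matrix version in MATLAB and the element-wise version in NumPy.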

  • Defining an array in Python requires passing a list (or other sequence) to the np.array function, whereas in MATLAB a vector can be typed directly in square brackets, with elements separated by spaces or commas.
  • In Python, indexing starts at 0 and is performed with brackets, whereas in MATLAB indexing begins at 1 and is performed with parentheses.
  • In Python, slicing is left inclusive and right exclusive, whereas in MATLAB slicing is inclusive at both ends.
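  • In Python, the element type of an array is decided when the array is defined (integers in the example above). In MATLAB, the default element data type is a double-precision float. This matters for element-by-element division: under Python 2, dividing two integer arrays with / discards the fractional part, while under Python 3 the / operator always returns floats.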
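The points in this list can be checked with a short script. This is a sketch that assumes Python 3 and a recent NumPy release, so its division results differ from the Python 2 session shown earlier; the MATLAB equivalents appear only in comments.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# Indexing starts at 0 and uses brackets.
first = a[0]       # MATLAB: a(1)
last = a[-1]       # MATLAB: a(end)

# Slices include the start index and exclude the stop index.
middle = a[1:3]    # MATLAB: a(2:3)

# The element type is fixed at creation: integers in, integer array out.
int_kind = a.dtype.kind                          # 'i'
as_float = np.array([1, 2, 3, 4], dtype=float)   # request floats explicitly
inferred = np.array([1.0, 2, 3, 4])              # one float makes all floats

# Under Python 3, / is true division and // is floor division.
true_div = a / b
floor_div = a // b

print(first, last, middle.tolist())
print(int_kind, as_float.dtype, inferred.dtype)
print(true_div, floor_div.tolist())
```

Passing dtype=float when the array is created is the closest analogue to MATLAB’s default double-precision behavior.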

 

Ultimately, every aspiring data scientist should be familiar with the variety of tools available to them. Those coming from academic research will find Python’s NumPy library a natural starting point because of its similarity to the MATLAB programming language. Proficiency in NumPy brings the data scientist one step closer to unlocking Python’s full potential for comprehensive data analytics.
