
Advanced NumPy Functions for Array Creation, Manipulation, and Analysis

This article introduces a collection of lesser‑known NumPy functions (including np.full_like, np.logspace, np.meshgrid, np.triu, np.ravel, np.vstack, np.r_, np.where, np.allclose, np.argsort, np.isneginf, np.polyfit, np.clip, np.count_nonzero, and np.array_split) and demonstrates their usage with code examples for data‑science and scientific‑computing tasks.

Python Programming Learning Circle

NumPy is a core Python package for scientific computing, providing efficient creation and manipulation of numerical arrays. Beyond the common functions, many powerful utilities can simplify data‑science workflows.

np.full_like creates an array with the same shape as an existing one, filled with a custom constant (e.g., π):

import numpy as np
array = np.array([[1, 4, 6, 8], [9, 4, 4, 4], [2, 7, 2, 3]])
# dtype is set explicitly because the template array holds integers
array_w_pi = np.full_like(array, fill_value=np.pi, dtype=np.float32)
print(array_w_pi)
# [[3.1415927 3.1415927 3.1415927 3.1415927]
#  [3.1415927 3.1415927 3.1415927 3.1415927]
#  [3.1415927 3.1415927 3.1415927 3.1415927]]
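
The explicit dtype in the call above matters. As a minimal sketch (the name template is only illustrative), omitting it makes np.full_like inherit the integer dtype of the existing array, so the fill value is truncated:

```python
import numpy as np

template = np.array([[1, 4, 6, 8], [9, 4, 4, 4]])  # integer dtype

# Without dtype, the result inherits the template's integer dtype,
# so the fractional part of pi is lost.
truncated = np.full_like(template, fill_value=np.pi)

# Passing a float dtype keeps the value intact.
preserved = np.full_like(template, fill_value=np.pi, dtype=np.float64)

print(truncated.dtype, truncated[0, 0])
print(preserved.dtype, preserved[0, 0])
```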

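np.logspace generates numbers spaced evenly on a logarithmic scale, useful when a geometric progression is required. The start and stop arguments are exponents of base, not the endpoints themselves, so the call below runs from e^1 to e^100:

log_array = np.logspace(start=1, stop=100, num=15, base=np.e)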
print(log_array)
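
To make the role of start and stop concrete, here is a small sketch with the default base of 10. The variable names are illustrative; the comparison with np.linspace is one way to see what np.logspace computes.

```python
import numpy as np

# start and stop are exponents of the base, so this yields 10**0 .. 10**3.
powers_of_ten = np.logspace(start=0, stop=3, num=4)  # base defaults to 10
print(powers_of_ten)  # the values 1, 10, 100, 1000

# Equivalent construction: raise the base to evenly spaced exponents.
exponents = np.linspace(0, 3, num=4)
manual = 10.0 ** exponents
print(np.allclose(powers_of_ten, manual))
```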

np.meshgrid builds coordinate matrices from coordinate vectors, enabling vectorised evaluation of functions over a grid (e.g., 2‑D sine surface):

x = [1, 2, 3, 4]
y = [3, 5, 6, 8]
xx, yy = np.meshgrid(x, y)
print(xx)
print(yy)
# Plot every grid point (requires matplotlib)
import matplotlib.pyplot as plt
plt.plot(xx, yy, linestyle="none", marker="o", color="red")
plt.show()
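
The sine surface mentioned above can be sketched as follows. The grid sizes and the particular function sin(x)·cos(y) are assumptions chosen for illustration.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
y = np.linspace(0, 2 * np.pi, 40)

# Both outputs have shape (len(y), len(x)).
xx, yy = np.meshgrid(x, y)

# Evaluate the function at every grid point in one vectorised expression.
z = np.sin(xx) * np.cos(yy)
print(xx.shape, z.shape)
```

The resulting z can be passed to a surface or contour plot, for example plt.contourf(xx, yy, z).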

np.triu / np.tril return the upper or lower triangular part of a matrix, often used to mask heatmaps of correlation matrices:

import seaborn as sns
diamonds = sns.load_dataset("diamonds")
# diamonds contains categorical columns; restrict the correlation to numeric ones
matrix = diamonds.corr(numeric_only=True)
mask = np.triu(np.ones_like(matrix, dtype=bool))
sns.heatmap(matrix, mask=mask, square=True, annot=True, fmt=".2f", center=0)
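
For a view of what the two functions return on their own, here is a small sketch on a hand-written 3×3 matrix (the matrix is illustrative). The optional k argument shifts the diagonal, which helps when the main diagonal should be excluded as well.

```python
import numpy as np

m = np.arange(1, 10).reshape(3, 3)

upper = np.triu(m)              # zeros below the main diagonal
lower = np.tril(m)              # zeros above the main diagonal
strict_upper = np.triu(m, k=1)  # also zeros the main diagonal

print(upper)
print(lower)
print(strict_upper)
```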

np.ravel / ndarray.flatten convert multi‑dimensional arrays to 1‑D. ravel returns a view when possible, while flatten always returns a copy. Note that flatten exists only as an array method; there is no np.flatten function:

array = np.random.randint(0, 10, size=(4, 5))
print(array.ravel())
print(array.flatten())
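
The view-versus-copy difference can be checked directly. This is a minimal sketch on a small contiguous array, where ravel is able to return a view:

```python
import numpy as np

grid = np.zeros((2, 3))

raveled = grid.ravel()      # a view, since grid is contiguous
flattened = grid.flatten()  # always an independent copy

raveled[0] = 99     # writes through to grid
flattened[1] = -1   # leaves grid untouched

print(grid[0, 0], grid[0, 1])
print(np.shares_memory(grid, raveled), np.shares_memory(grid, flattened))
```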

np.vstack / np.hstack stack arrays vertically (as additional rows) or horizontally (as additional columns). The inputs are first reshaped to compatible dimensions, a common pattern in Kaggle competitions for merging predictions:

array1 = np.arange(1, 11).reshape(-1, 1)
array2 = np.random.randint(1, 10, size=10).reshape(-1, 1)
hstacked = np.hstack((array1, array2))
print(hstacked)

array1 = np.arange(20, 31).reshape(1, -1)
array2 = np.random.randint(20, 31, size=11).reshape(1, -1)
vstacked = np.vstack((array1, array2))
print(vstacked)

np.r_ / np.c_ are indexing helpers that concatenate arrays without explicit reshaping. With 1‑D inputs, r_ joins them end to end into one longer 1‑D array, while c_ stacks them as the columns of a 2‑D array.

preds1 = np.random.rand(100)
preds2 = np.random.rand(100)
end_to_end = np.r_[preds1, preds2]
as_cols = np.c_[preds1, preds2]
print(end_to_end.shape)  # (200,)
print(as_cols.shape)  # (100, 2)

np.info prints the docstring of any NumPy object, providing quick reference without opening external documentation.

np.info(np.info)

np.where returns indices where a condition holds, useful for extracting elements that satisfy a threshold:

probs = np.random.rand(100)
idx = np.where(probs > 0.8)
print(probs[idx])
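
np.where also accepts two further arguments, in which case it picks element-wise between them instead of returning indices. A short sketch with hand-written probabilities (the values and names are illustrative):

```python
import numpy as np

probs = np.array([0.15, 0.92, 0.40, 0.81, 0.79])

# One-argument form: a tuple of index arrays, one per dimension.
positions = np.where(probs > 0.8)[0]

# Three-argument form: choose 1 where the condition holds, 0 elsewhere.
labels = np.where(probs > 0.8, 1, 0)

print(positions)
print(labels)
```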

np.all / np.any evaluate boolean conditions across an array, often combined with assert for data validation.

array1 = np.random.rand(100)
array2 = np.random.rand(100)
print(np.all(array1 == array2))   # False: two random float arrays will not match everywhere
print(np.any(array1 == array2))   # almost certainly False too; exact matches are far more likely with integer arrays
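
The assert pattern mentioned above might look like the following sketch. The ages array and the specific checks are hypothetical examples of validation rules.

```python
import numpy as np

ages = np.array([23, 35, 41, 58])  # hypothetical input data

# Fail fast if any value violates an expectation.
assert np.all(ages >= 0), "ages must be non-negative"
assert not np.any(np.isnan(ages.astype(float))), "ages must not contain NaN"

# any() answers "does at least one element satisfy this?"
has_senior = bool(np.any(ages > 55))
print(has_senior)
```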

np.allclose checks whether two arrays are element‑wise equal within a relative tolerance (rtol) and an absolute tolerance (atol), handy for comparing floating‑point results.

a1 = np.arange(1, 10, 0.5)
a2 = np.arange(0.8, 9.8, 0.5)
print(np.allclose(a1, a2, rtol=0.3))
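
As a sketch of why a tolerance is needed at all, the snippet below compares 0.1 + 0.2 with 0.3 and reproduces the documented check |a - b| <= atol + rtol * |b| by hand, using NumPy's default tolerances.

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

exact = bool(np.all(a == b))      # 0.1 + 0.2 is not exactly 0.3 in binary floating point
close = bool(np.allclose(a, b))   # passes with the default tolerances

# The same test written out with the default rtol and atol.
rtol, atol = 1e-05, 1e-08
manual = bool(np.all(np.abs(a - b) <= atol + rtol * np.abs(b)))

print(exact, close, manual)
```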

np.argsort returns the indices that would sort an array, enabling reuse of the ordering for multiple operations.

random_ints = np.random.randint(1, 100, size=20)
idx = np.argsort(random_ints)
print(random_ints[idx])
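
Reusing the ordering is where argsort pays off: the same index array can reorder any array aligned with the one that was sorted. A small sketch with made-up scores and names:

```python
import numpy as np

scores = np.array([72, 95, 58, 88])
names = np.array(["ann", "bob", "cy", "dee"])  # aligned with scores

order = np.argsort(scores)   # indices that sort scores ascending

sorted_scores = scores[order]
sorted_names = names[order]  # same ordering applied to a second array

# Reverse the ordering to get the two highest scorers.
top_two = names[order[::-1][:2]]

print(sorted_scores, sorted_names, top_two)
```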

np.isneginf / np.isposinf detect negative or positive infinity values within arrays.

a = np.array([-9999, 99999, -np.inf])
print(np.any(np.isneginf(a)))

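np.polyfit performs a least‑squares polynomial fit (linear regression when deg=1) directly on NumPy arrays. Coefficients are returned from the highest degree down, so a degree‑1 fit yields the slope followed by the intercept.

# diamonds is the seaborn dataset loaded in the np.triu example above
X = diamonds["carat"].to_numpy()
y = diamonds["price"].to_numpy()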
slope, intercept = np.polyfit(X, y, deg=1)
print(slope, intercept)
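
For a version that runs without the dataset, here is a sketch on noise-free synthetic data (the line y = 3x + 2 is an assumption chosen so the expected coefficients are known), followed by np.polyval to evaluate the fitted polynomial:

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 3.0 * x + 2.0  # synthetic, noise-free line

# Coefficients come back highest degree first: slope, then intercept.
slope, intercept = np.polyfit(x, y, deg=1)

# Evaluate the fitted line at the original x values.
predicted = np.polyval([slope, intercept], x)

print(slope, intercept)
```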

np.clip limits values to a specified interval, useful for enforcing hard bounds on data (e.g., ages between 10 and 70).

ages = np.random.randint(1, 110, size=100)
limited_ages = np.clip(ages, 10, 70)
print(limited_ages)

np.count_nonzero counts the number of non‑zero elements in an array, often used to assess sparsity.

a = np.random.randint(-50, 50, size=100000)
print(np.count_nonzero(a))
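
Two common variations, sketched on a small hand-written matrix: counting along an axis, and counting elements that satisfy a condition (a boolean array counts True as non-zero). The sparsity figure is simply one minus the non-zero fraction.

```python
import numpy as np

m = np.array([[0, 1, 0, 0],
              [2, 0, 0, 3],
              [0, 0, 0, 0]])

total = np.count_nonzero(m)            # non-zero entries overall
per_row = np.count_nonzero(m, axis=1)  # non-zero entries in each row
sparsity = 1 - total / m.size          # fraction of entries that are zero
n_large = np.count_nonzero(m > 1)      # entries satisfying a condition

print(total, per_row, sparsity, n_large)
```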

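np.array_split divides an array or DataFrame into a specified number of sub‑arrays. Unlike np.split, it does not raise an error when the length is not evenly divisible; the leading chunks simply receive one extra element.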

import datatable as dt
df = dt.fread("data/train.csv").to_pandas()
splitted = np.array_split(df, 100)
print(len(splitted))
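
The uneven-split behaviour is easiest to see on a plain array. A minimal sketch, with np.split shown for contrast:

```python
import numpy as np

values = np.arange(10)

# 10 elements into 3 chunks: the leading chunk gets the extra element.
chunks = np.array_split(values, 3)
sizes = [len(chunk) for chunk in chunks]
print(sizes)

# np.split insists on an equal division and raises otherwise.
try:
    np.split(values, 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)
```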

The examples above illustrate how these functions can streamline data preprocessing, feature engineering, and exploratory analysis in Python‑based scientific and machine‑learning pipelines.
