Understanding TensorFlow Internals with TensorSlow: Computational Graph, Forward/Backward Propagation, and Building an MLP
This article explains how Huajiao Live uses Spark for data preprocessing and TensorFlow for distributed deep‑learning training, then walks through the TensorSlow project to illustrate computational‑graph concepts, forward and backward propagation, loss construction, gradient‑descent optimization, and a step‑by‑step Python implementation of a multi‑layer perceptron.
The author, a former algorithm engineer at Huajiao Live, introduces the deep‑learning workflow used in Huajiao’s live‑stream recommendation system, beginning with Spark‑based data cleaning to build user and item profiles stored in HDFS.
TensorFlow serves as the core deep‑learning framework; training jobs are scheduled with Hbox, and models are deployed via TF‑Serving wrapped in a TF‑Web service, while Go servers provide online recommendation APIs.
TensorFlow, an open‑source framework released by Google in 2015, contains over a million lines of code split between front‑end and back‑end components, making its inner workings opaque to many. The GitHub project TensorSlow re‑implements TensorFlow’s core in pure Python to illustrate these mechanisms without concern for performance.
Deep learning, a branch of machine learning, studies deep neural networks. A typical feed‑forward network maps inputs x to outputs y using parameters θ, with hidden layers representing composite functions. The cost function J(θ) measures the distance between the model's predictions and the data, and gradient descent iteratively updates θ in the direction of the negative gradient to minimize J.
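The update rule can be made concrete with a toy example (not from the article): gradient descent on the one‑dimensional cost J(θ) = (θ − 3)², whose gradient is 2(θ − 3), converges toward the minimum at θ = 3.

```python
# Toy gradient descent on J(theta) = (theta - 3)^2,
# whose analytic gradient is dJ/dtheta = 2 * (theta - 3).
def gradient_descent(theta, learning_rate=0.1, steps=100):
    for _ in range(steps):
        grad = 2 * (theta - 3)          # local gradient of J at theta
        theta -= learning_rate * grad   # update: theta <- theta - lr * grad
    return theta

print(gradient_descent(theta=0.0))  # approaches the minimum at theta = 3
```

The same rule, applied to every parameter tensor at once, is what the optimizer below performs on the computational graph.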
The computational graph is the language of TensorFlow. Nodes represent variables, placeholders, or operations. For example, a Placeholder node is defined as:

```python
class placeholder:
    def __init__(self):
        self.consumers = []
        _default_graph.placeholders.append(self)
```

A Variable node holds model parameters:

```python
class Variable:
    def __init__(self, initial_value=None):
        self.value = initial_value
        self.consumers = []
        _default_graph.variables.append(self)
```

An Operation node combines the outputs of its input nodes:

```python
class Operation:
    def __init__(self, input_nodes=[]):
        self.input_nodes = input_nodes
        self.consumers = []
        # Register this operation as a consumer of each of its inputs
        for input_node in input_nodes:
            input_node.consumers.append(self)
        _default_graph.operations.append(self)

    def compute(self):
        # Overridden by concrete operations such as add or matmul
        pass
```
Execution is performed by a Session, which traverses the graph in topological order and computes each node:

```python
class Session:
    def run(self, operation, feed_dict={}):
        """Computes the output of an operation"""
        ...

def traverse_postorder(operation):
    nodes_postorder = []
    def recurse(node):
        if isinstance(node, Operation):
            for input_node in node.input_nodes:
                recurse(input_node)
        nodes_postorder.append(node)
    recurse(operation)
    return nodes_postorder
```
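The elided body of run can be sketched as follows. This is a simplified reconstruction in TensorSlow's spirit, self-contained so it can execute on its own; the compute(*inputs) convention and the minimal node classes bundled here are assumptions of the sketch:

```python
import numpy as np

class Graph:
    def __init__(self):
        self.operations, self.placeholders, self.variables = [], [], []
    def as_default(self):
        global _default_graph
        _default_graph = self

class Operation:
    def __init__(self, input_nodes=[]):
        self.input_nodes = input_nodes
        self.consumers = []
        for input_node in input_nodes:
            input_node.consumers.append(self)
        _default_graph.operations.append(self)

class placeholder:
    def __init__(self):
        self.consumers = []
        _default_graph.placeholders.append(self)

class Variable:
    def __init__(self, initial_value=None):
        self.value = initial_value
        self.consumers = []
        _default_graph.variables.append(self)

class add(Operation):
    def __init__(self, x, y): super().__init__([x, y])
    def compute(self, x_value, y_value): return x_value + y_value

def traverse_postorder(operation):
    nodes_postorder = []
    def recurse(node):
        if isinstance(node, Operation):
            for input_node in node.input_nodes:
                recurse(input_node)
        nodes_postorder.append(node)
    recurse(operation)
    return nodes_postorder

class Session:
    def run(self, operation, feed_dict={}):
        """Visit nodes in post-order so inputs are ready before consumers."""
        for node in traverse_postorder(operation):
            if isinstance(node, placeholder):
                node.output = feed_dict[node]    # value supplied by the caller
            elif isinstance(node, Variable):
                node.output = node.value         # stored parameter value
            else:  # Operation: feed in the outputs of its input nodes
                inputs = [n.output for n in node.input_nodes]
                node.output = node.compute(*inputs)
        return operation.output

Graph().as_default()
x = placeholder()
b = Variable(np.array([1, 1]))
z = add(x, b)
print(Session().run(z, {x: np.array([1, 2])}))  # [2 3]
```

The post-order traversal is what guarantees the topological property: every input node is computed before any operation that consumes it.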
Forward propagation is illustrated with a simple affine transformation graph, implemented as:

```python
# Create a new graph
Graph().as_default()

# Variables
A = Variable([[1, 0], [0, -1]])
b = Variable([1, 1])

# Placeholder
x = placeholder()

# Hidden node: y = Ax
y = matmul(A, x)

# Output node: z = Ax + b
z = add(y, b)

session = Session()
output = session.run(z, {x: [1, 2]})
print(output)  # z = Ax + b = [2, -1]
```

The loss for classification uses cross‑entropy:

```python
# Cross-entropy loss: J = -sum over examples of sum_j c_j * log(p_j)
J = negative(reduce_sum(reduce_sum(multiply(c, log(p)), axis=1)))
```

Gradient descent is encapsulated in an optimizer class:

```python
class GradientDescentOptimizer:
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def minimize(self, loss):
        learning_rate = self.learning_rate

        class MinimizationOperation(Operation):
            def compute(self):
                # Compute gradients of the loss w.r.t. every node, then
                # move each Variable against its gradient
                grad_table = compute_gradients(loss)
                for node in grad_table:
                    if type(node) == Variable:
                        node.value -= learning_rate * grad_table[node]

        return MinimizationOperation()
```

(The learning rate is captured in a local variable because, inside MinimizationOperation.compute, self refers to the operation rather than to the optimizer.)

Backward propagation computes gradients using the chain rule. Starting from the loss node (whose gradient with respect to itself is 1), each node aggregates the gradients flowing back from its consumers, multiplies by its local derivative, and propagates the result upstream. The helper compute_gradients performs a BFS‑style traversal:

```python
def compute_gradients(loss):
    # grad_table[node] will contain the gradient of the loss w.r.t. the node's output
    ...
    return grad_table
```

Finally, a multi‑layer perceptron (MLP) with three hidden layers is built and trained:

```python
# Build a new graph
ts.Graph().as_default()

# Placeholders for inputs and one-hot class labels
X = ts.placeholder()
c = ts.placeholder()

# Hidden layer 1: 2 -> 4
W_hidden1 = ts.Variable(np.random.randn(2, 4))
b_hidden1 = ts.Variable(np.random.randn(4))
p_hidden1 = ts.sigmoid(ts.add(ts.matmul(X, W_hidden1), b_hidden1))

# Hidden layer 2: 4 -> 8
W_hidden2 = ts.Variable(np.random.randn(4, 8))
b_hidden2 = ts.Variable(np.random.randn(8))
p_hidden2 = ts.sigmoid(ts.add(ts.matmul(p_hidden1, W_hidden2), b_hidden2))

# Hidden layer 3: 8 -> 2
W_hidden3 = ts.Variable(np.random.randn(8, 2))
b_hidden3 = ts.Variable(np.random.randn(2))
p_hidden3 = ts.sigmoid(ts.add(ts.matmul(p_hidden2, W_hidden3), b_hidden3))

# Output layer: softmax over two classes
W_output = ts.Variable(np.random.randn(2, 2))
b_output = ts.Variable(np.random.randn(2))
p_output = ts.softmax(ts.add(ts.matmul(p_hidden3, W_output), b_output))

# Cross-entropy loss
J = ts.negative(ts.reduce_sum(ts.reduce_sum(ts.multiply(c, ts.log(p_output)), axis=1)))

# Optimizer
minimization_op = ts.train.GradientDescentOptimizer(learning_rate=0.03).minimize(J)

# Training loop (feed_dict maps X and c to the training data)
session = ts.Session()
for step in range(2000):
    J_value = session.run(J, feed_dict)
    if step % 100 == 0:
        print("Step:", step, "Loss:", J_value)
    session.run(minimization_op, feed_dict)
```

Visualization of the decision boundary shows that the MLP learns a complex non‑linear relationship. The article concludes that TensorSlow offers a clear view of deep‑learning framework internals, while TensorFlow provides production‑grade performance, distributed execution, and extensive tooling.
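The elided body of compute_gradients can likewise be sketched as a breadth-first walk from the loss node toward the inputs. Two assumptions of this sketch: the gradient_fns registry (mapping each operation type to a function that returns the chain-rule products for its inputs) stands in for TensorSlow's gradient-function registry, and the traversal relies on every consumer of a node being reached before the node itself, which holds for the layered graphs used here:

```python
from collections import deque

def compute_gradients(loss, gradient_fns):
    """BFS from the loss; grad_table[node] holds dLoss/d(node output)."""
    grad_table = {loss: 1.0}   # the loss's gradient w.r.t. itself is 1
    visited = {loss}
    queue = deque([loss])
    while queue:
        node = queue.popleft()
        if node is not loss:
            # Sum the contributions flowing back from every consumer
            grad_table[node] = 0.0
            for consumer in node.consumers:
                downstream_grad = grad_table[consumer]
                # Chain rule: local derivative times downstream gradient,
                # one entry per input of the consumer
                input_grads = gradient_fns[type(consumer)](consumer, downstream_grad)
                grad_table[node] += input_grads[consumer.input_nodes.index(node)]
        # Continue upstream through the node's inputs
        if hasattr(node, 'input_nodes'):
            for input_node in node.input_nodes:
                if input_node not in visited:
                    visited.add(input_node)
                    queue.append(input_node)
    return grad_table
```

For example, an add operation's gradient function would return the downstream gradient unchanged for both inputs, since d(x + y)/dx = d(x + y)/dy = 1.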
360 Tech Engineering
The official tech channel of 360, building a professional technology-sharing platform for the brand.