
Understanding Selenium WebDriver, HTML Fundamentals, and Element Locators (XPath & CSS Selectors)

This article explains the distinction between selenium‑webdriver and browser drivers, introduces HTML basics and element attributes, and details Selenium's element‑locating APIs with practical examples of XPath and CSS selector strategies for UI automation.

DevOps Cloud Academy

In my current work I use the UI automation framework pytest+selenium. Along the way I learned that selenium‑webdriver is the client library that test code calls, wrapping browser commands in a language-level API, while a webdriver (a browser driver such as chromedriver or geckodriver) is the driver software provided by the browser vendor and implements the W3C WebDriver protocol.

The W3C WebDriver protocol defines a remote‑control interface that allows scripts to operate browsers across platforms and languages, effectively giving us a programmable backdoor to manipulate the DOM.

In summary, Selenium commands drive the webdriver, which in turn controls the browser engine. Because browsers differ in how they handle elements, each browser ships its own driver, and selenium‑webdriver has to work with all of them.
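To make the chain above concrete, here is a minimal sketch of the message a client library sends to a browser driver when it looks up an element. The endpoint and the "using"/"value" body fields follow the W3C WebDriver "Find Element" command; the function name find_element_request and the session id "abc123" are made up for illustration, and no real driver is contacted.

```python
import json


def find_element_request(session_id, using, value):
    """Describe the HTTP request for the W3C 'Find Element' command."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/element",
        # "using" is the location strategy, "value" is the selector itself.
        "body": json.dumps({"using": using, "value": value}),
    }


# "abc123" is a placeholder; a real session id comes from the driver.
req = find_element_request("abc123", "css selector", "#kw")
print(req["method"], req["path"])
print(req["body"])
```

A Selenium call such as a CSS lookup ends up as a request of roughly this shape; the browser driver translates it into browser-specific actions.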

HTML Overview

UI automation operates on HTML documents, so a quick recap of HTML is useful. HTML consists of tags, attributes, and text. Common constructs include the <!DOCTYPE HTML> declaration, <!--...--> comments, and tags such as <html> and <body>.

Tags are enclosed in angle brackets and usually appear in pairs (e.g., <html>…</html>), but some, such as <br> or <hr>, are void elements that have no closing tag. An element is a tag together with its content, and element attributes are key‑value pairs written inside the start tag.

Boolean attributes such as disabled can be declared without a value; their presence alone is enough, for example, to mark an input field as disabled.

<!DOCTYPE HTML> # DOCTYPE indicates an HTML document
<!--...--> # HTML comment
<html> # Root element
<body> # Body content
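The tag, attribute, and boolean-attribute rules above can be checked with Python's standard html.parser module. This is only a sketch: the class name AttributeLister and the sample markup are invented for the example.

```python
from html.parser import HTMLParser


class AttributeLister(HTMLParser):
    """Record every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a boolean attribute
        # written without a value arrives with value None.
        self.seen.append((tag, dict(attrs)))


parser = AttributeLister()
parser.feed('<form><input id="kw" name="wd" disabled><br></form>')
for tag, attrs in parser.seen:
    print(tag, attrs)
```

The input element reports disabled with the value None, which is how the parser represents an attribute that is present but has no value. These attributes are exactly what the locators in the next section match against.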

Selenium API

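Selenium provides many locating APIs based on different element attributes. The find_element_by_* helpers listed here are the Selenium 3 style; Selenium 4 deprecated them in favor of find_element(By.ID, "...") and find_elements(By.ID, "..."), and later 4.x releases removed them. The most common methods are: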

| Locator | Single‑element API | Multiple‑element API | Example |
| --- | --- | --- | --- |
| id | find_element_by_id() | find_elements_by_id() | driver.find_element_by_id("result_logo") |
| name | find_element_by_name() | find_elements_by_name() | driver.find_element_by_name("f") |
| class_name | find_element_by_class_name() | find_elements_by_class_name() | driver.find_element_by_class_name("fm") |
| tag_name | find_element_by_tag_name() | find_elements_by_tag_name() | driver.find_element_by_tag_name("a") |
| link_text | find_element_by_link_text() | find_elements_by_link_text() | driver.find_element_by_link_text("index") |
| partial_link_text | find_element_by_partial_link_text() | find_elements_by_partial_link_text() | driver.find_element_by_partial_link_text("in") |
| xpath | find_element_by_xpath() | find_elements_by_xpath() | |
| css selector | find_element_by_css_selector() | find_elements_by_css_selector() | |

find_element returns the first matching element and raises NoSuchElementException if nothing matches; find_elements returns a list of all matches, which is simply empty when nothing matches. Because unique attributes are rare, XPath or CSS selectors are often preferred.
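One way to use the difference between the two calls is a small helper that returns None instead of raising when an element is absent. The sketch below assumes the Selenium 4 call shape driver.find_elements(by, value), where the strategy is a plain string ("css selector" is the value of By.CSS_SELECTOR). The names find_or_none and FakeDriver are hypothetical; FakeDriver only stands in for a real browser session so the example runs anywhere.

```python
CSS = "css selector"  # same string as selenium's By.CSS_SELECTOR


def find_or_none(driver, by, value):
    """Return the first match, or None when nothing matches."""
    matches = driver.find_elements(by, value)  # empty list if no match
    return matches[0] if matches else None


class FakeDriver:
    """Hypothetical stand-in for a WebDriver; maps locators to results."""

    def __init__(self, page):
        self.page = page

    def find_elements(self, by, value):
        return list(self.page.get((by, value), []))


driver = FakeDriver({(CSS, "input.s_ipt"): ["<input id='kw'>"]})
print(find_or_none(driver, CSS, "input.s_ipt"))
print(find_or_none(driver, CSS, "#missing"))
```

With a real driver the same helper avoids wrapping every optional lookup in a try/except block for NoSuchElementException.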

Best locating practices:

Avoid deep hierarchical XPath; prefer relative paths.

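Prefer CSS selectors over XPath where either would work: they are usually shorter and are generally reported to perform better, since a broad XPath expression can force a walk over much of the DOM tree. The actual difference depends on the browser and the expression.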

XPath Locating

XPath (XML Path Language) works on HTML because HTML’s tree structure mirrors XML. Nodes represent elements, attributes, or text. XPath expressions can be absolute (starting from the root) or relative (starting with //).

Examples:

/html/body/div[1]/div[1]/div[5]/div/div/form/span/input[1]   # absolute path
//input[@id='kw']                                            # relative path
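The practical difference between the two styles shows up when the page layout changes. The sketch below uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath: paths are written relative to the root element (./ and .// instead of / and //). The sample page and the added section wrapper are invented for the demonstration.

```python
import xml.etree.ElementTree as ET

PAGE = (
    "<html><body><div><form><span>"
    "<input id='kw'/><input id='su'/>"
    "</span></form></div></body></html>"
)
root = ET.fromstring(PAGE)

FULL_PATH = "./body/div/form/span/input[1]"  # every level spelled out
BY_ATTR = ".//input[@id='kw']"               # any depth, matched by attribute

# Both expressions find the same element on the original page.
print(root.find(FULL_PATH) is root.find(BY_ATTR))  # True

# Simulate a redesign: wrap the existing <div> in a new <section>.
body = root.find("body")
old_div = body[0]
body.remove(old_div)
section = ET.SubElement(body, "section")
section.append(old_div)

print(root.find(FULL_PATH))          # None: the hard-coded path broke
print(root.find(BY_ATTR).get("id"))  # kw: the attribute match still works
```

A single extra wrapper element is enough to break the fully spelled-out path, while the attribute-based expression keeps working.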

Predicates inside [] filter node sets, supporting logical operators and, or, and the union operator |. Axes such as parent, child, ancestor, descendant, etc., allow navigation relative to the current node.

Example using axes:

//form/div[last()-1]/ancestor::div[@class='modal-content']
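Read left to right, that expression selects the second-to-last div inside a form and then climbs to the enclosing div whose class is modal-content. The sketch below traces those two steps by hand. ElementTree understands the [last()-1] predicate but not the ancestor axis, so the upward walk is emulated with a parent map; the markup, the ids, and the helper ancestors are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

PAGE = """
<html><body>
  <div class="modal-content">
    <form>
      <div id="user"/>
      <div id="password"/>
      <div id="submit"/>
    </form>
  </div>
</body></html>
""".strip()
root = ET.fromstring(PAGE)

# ElementTree nodes do not know their parent, so build a lookup table.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """Yield parent, grandparent, ... up to the root (the ancestor axis)."""
    while node in parent_of:
        node = parent_of[node]
        yield node


# Step 1: //form/div[last()-1] -> second-to-last <div> under the form.
target = root.find(".//form/div[last()-1]")
print(target.get("id"))  # password

# Step 2: ancestor::div[@class='modal-content'] -> matching ancestor.
modal = next(
    a for a in ancestors(target)
    if a.tag == "div" and a.get("class") == "modal-content"
)
print(modal.get("class"))  # modal-content
```

In a real browser the single XPath expression does both steps at once; the manual version is only meant to show what each part contributes.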

CSS Selector Locating

CSS selectors are another powerful way to locate elements. They are generally considered faster than XPath because the browser already uses the same selector matching to apply styles to the page.

| Selector | Example | Meaning |
| --- | --- | --- |
| element | $('input') | All input elements |
| #id | $('#kw') | Element with id="kw" |
| .class | $('.s_ipt') | Elements with class s_ipt |
| [attribute] | $('[type]') | Elements that have a type attribute |
| [attribute=value] | $('[name="wd"]') | Elements where name="wd" |
| e1>e2 | $('span>input') | input that is a direct child of a span |
| e1 e2 | $('a div') | div that is a descendant of an a |
| e1+e2 | $('div+a') | a immediately following a div |
| e1:nth-child(n) | $('span>span:nth-child(1)') | span that is the first child of its parent span |

Examples in Selenium code:

find_element_by_css_selector("input#kw")                         # input with id="kw"
find_element_by_css_selector("input.s_ipt")                      # input with class="s_ipt"
find_element_by_css_selector('a[src$=".pdf"]')                   # <a> whose src ends with .pdf
find_element_by_css_selector("[name='wd'][autocomplete='off']")  # element with two specific attributes
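The $= operator in the third example is one of several CSS attribute operators. The sketch below spells out what the common ones test for, using plain string operations on a dictionary of attributes. The function attr_matches and the sample link are hypothetical, and a link's usual href attribute is used here instead of src.

```python
def attr_matches(attrs, name, op, value):
    """Mimic the CSS attribute selectors [name=v], [name^=v], [name$=v], [name*=v]."""
    actual = attrs.get(name)
    if actual is None:
        return False  # the attribute must be present at all
    if op == "=":
        return actual == value          # exact match
    if value == "":
        return False  # ^=, $= and *= never match an empty value
    if op == "^=":
        return actual.startswith(value)  # prefix match
    if op == "$=":
        return actual.endswith(value)    # suffix match
    if op == "*=":
        return value in actual           # substring match
    raise ValueError(f"unsupported operator: {op}")


link = {"href": "/docs/manual.pdf", "class": "download"}
print(attr_matches(link, "href", "$=", ".pdf"))    # True
print(attr_matches(link, "href", "^=", "/docs/"))  # True
print(attr_matches(link, "href", "*=", "guide"))   # False
print(attr_matches(link, "title", "=", "Manual"))  # False
```

Stacking several bracketed conditions, as in the last Selenium example above, simply requires every one of these tests to pass on the same element.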

About the Author

The author, Ze Yang, is a DevOps practitioner who shares enterprise‑level DevOps operations and development techniques, focusing on Linux, automation, and related courses.
