How to check if a string is a valid URL in Python

Here's a step-by-step tutorial on how to check whether a string is a valid URL in Python using a regular expression:

Step 1: Import the necessary modules

This tutorial validates URLs with regular expressions, so you will need the re module from Python's standard library. You can import it using the following line of code:

import re

Step 2: Define a function to check if a string is a valid URL

Create a function called is_valid_url that takes a string as input and returns a boolean value indicating whether the string is a valid URL. Inside the function, you can use regular expressions to validate the URL. Here's an example implementation:

def is_valid_url(url):
    # Regular expression pattern to match a valid URL
    pattern = re.compile(
        r'^https?://'  # http:// or https://
        r'([a-z0-9.-]+)'  # domain name
        r'(\.[a-z]{2,4})'  # dot-something
        r'(/[a-zA-Z0-9%._-]+)*'  # path
        r'(\?[a-zA-Z0-9&=]+)*'  # query string
        r'(#[a-zA-Z0-9_-]+)?$'  # fragment identifier
        , re.IGNORECASE)

    # Use the pattern to match the URL
    match = pattern.match(url)

    # Return True if the URL is valid, otherwise False
    return bool(match)

Let's break down the regular expression pattern used in this example:

  • ^https?://: Matches the start of the string followed by http:// or https://.
  • ([a-z0-9.-]+): Matches the domain name, which can consist of letters, digits, dots, and hyphens. Because the pattern is compiled with re.IGNORECASE, uppercase letters match as well.
  • (\.[a-z]{2,4}): Matches the top-level domain (e.g., .com, .org). It allows only 2 to 4 letters, so longer top-level domains such as .museum are rejected.
  • (/[a-zA-Z0-9%._-]+)*: Matches the path, which can include alphanumeric characters, percent signs (for percent-encoded characters), dots, underscores, and hyphens. The path is optional and can contain multiple segments separated by slashes. Each segment needs at least one character after its slash, so a URL ending in a bare slash (e.g., https://example.com/) does not match.
  • (\?[a-zA-Z0-9&=]+)*: Matches the query string, which can include alphanumeric characters, ampersands, and equals signs. The query string is optional.
  • (#[a-zA-Z0-9_-]+)?$: Matches the fragment identifier (e.g., #section), followed by the end of the string. The fragment identifier is optional.
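
To see how these pieces line up against a real string, it can help to print what each capture group matched. The following is a small sketch that reuses the same pattern as is_valid_url; the sample URL and the labels list are made up for illustration. Note that a repeated group such as the path keeps only its last repetition.

```python
import re

# Same pattern as in is_valid_url, compiled at module level for inspection.
pattern = re.compile(
    r'^https?://'
    r'([a-z0-9.-]+)'
    r'(\.[a-z]{2,4})'
    r'(/[a-zA-Z0-9%._-]+)*'
    r'(\?[a-zA-Z0-9&=]+)*'
    r'(#[a-zA-Z0-9_-]+)?$',
    re.IGNORECASE,
)

# Hypothetical sample URL chosen to exercise every part of the pattern.
sample = "https://www.example.com/docs/page.html?q=1#top"
match = pattern.match(sample)

# Illustrative labels, one per capture group, in pattern order.
labels = [
    "domain",
    "top-level domain",
    "path (last segment)",
    "query string",
    "fragment",
]
for label, value in zip(labels, match.groups()):
    print(f"{label}: {value}")

# domain: www.example
# top-level domain: .com
# path (last segment): /page.html
# query string: ?q=1
# fragment: #top
```

The domain group stops at www.example because the regex engine backtracks until the next group can match .com as the top-level domain.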

Step 3: Test the function with example URLs

You can now test the is_valid_url function with different URLs to see if it correctly identifies them as valid or invalid. Here are a few examples:

# Test the function with example URLs
print(is_valid_url("https://www.example.com")) # True
print(is_valid_url("http://example.com")) # True
print(is_valid_url("www.example.com")) # False
print(is_valid_url("example.com")) # False

In this example, the first two URLs match the pattern, so the function returns True. The last two lack the http:// or https:// scheme that the pattern requires, so the function returns False.
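
The pattern from Step 2 is deliberately simple, and a few ordinary URLs fall outside it: one that ends in a bare slash, one with a port number, or one whose top-level domain is longer than four letters. It also accepts a string with a trailing newline, because $ can match just before a final newline. The following is a minimal sketch of a more permissive variant, not a complete validator. The names RELAXED_URL and is_valid_url_relaxed are hypothetical and not part of the function above, and the character classes are assumptions about which URLs you want to accept.

```python
import re

# Hypothetical relaxed pattern; the accepted character sets are assumptions.
RELAXED_URL = re.compile(
    r'https?://'                    # http:// or https://
    r'[a-z0-9-]+(\.[a-z0-9-]+)*'    # host labels separated by dots
    r'\.[a-z]{2,63}'                # top-level domain, 2 to 63 letters
    r'(:[0-9]{1,5})?'               # optional port number
    r'(/[a-z0-9%._~-]*)*'           # path; empty segments allow a trailing slash
    r'(\?[a-z0-9&=%._~-]*)?'        # optional query string
    r'(#[a-z0-9%._~-]*)?',          # optional fragment identifier
    re.IGNORECASE,
)


def is_valid_url_relaxed(url):
    # fullmatch requires the whole string to match, so no ^ or $ is needed
    # and a trailing newline is rejected.
    return RELAXED_URL.fullmatch(url) is not None


print(is_valid_url_relaxed("https://example.com/"))         # True
print(is_valid_url_relaxed("http://example.com:8080/api"))  # True
print(is_valid_url_relaxed("https://example.technology"))   # True
print(is_valid_url_relaxed("https://example.com\n"))        # False
print(is_valid_url_relaxed("www.example.com"))              # False
```

This variant still requires a dotted host name with a top-level domain, so strings such as https://localhost are rejected. Whether that is acceptable depends on where the URLs come from.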

That's it! You now have a function that checks whether a string matches a common http:// or https:// URL format in Python.