core._base

Classes

IterateSortDict

IterateSortDict(reverse=False)

A class for recursively sorting dictionary keys with natural sorting.

This class provides methods to sort dictionary keys recursively, handling nested dictionaries and using natural sorting for strings with embedded numbers.

Attributes:

Name Type Description
reverse bool

If True, sorts keys in descending order. Defaults to False.

Example

sorter = IterateSortDict(reverse=False)
data = {"item10": {"sub2": 1, "sub1": 2}, "item2": 3}
sorted_data = sorter.dict_update(data)

Initialize the IterateSortDict instance.

Parameters:

Name Type Description Default
reverse bool

If True, sorts keys in descending order. Defaults to False.

False
Source code in pyformatjson/core/_base.py
def __init__(self, reverse: bool = False) -> None:
    """Initialize the IterateSortDict instance.

    Args:
        reverse (bool, optional): If True, sorts keys in descending order.
            Defaults to False.
    """
    self.reverse = reverse

Functions

dict_sort
dict_sort(old)

Sort dictionary keys using natural sorting.

This method sorts the top-level keys of the dictionary using natural sorting that handles embedded numbers.

Parameters:

Name Type Description Default
old dict

The dictionary whose keys are to be sorted.

required

Returns:

Name Type Description
dict dict

A new dictionary with sorted keys.

Source code in pyformatjson/core/_base.py
def dict_sort(self, old: dict) -> dict:
    """Sort dictionary keys using natural sorting.

    This method sorts the top-level keys of the dictionary using
    natural sorting that handles embedded numbers.

    Args:
        old (dict): The dictionary whose keys are to be sorted.

    Returns:
        dict: A new dictionary with sorted keys.
    """
    return {k: old[k] for k in sort_int_str(list(old.keys()), self.reverse)}

dict_sort_iteration
dict_sort_iteration(old)

Recursively sort nested dictionaries.

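This method iterates through the dictionary and replaces every nested dictionary value with its recursively sorted version. The input dictionary is modified in place and the same object is returned; top-level key order is left unchanged.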

Parameters:

Name Type Description Default
old dict

The dictionary to be processed recursively.

required

Returns:

Name Type Description
dict dict

The dictionary with nested dictionaries sorted.

Source code in pyformatjson/core/_base.py
def dict_sort_iteration(self, old: dict) -> dict:
    """Recursively sort nested dictionaries.

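    This method iterates through the dictionary and replaces every nested
    dictionary value with its recursively sorted version. The input
    dictionary is modified in place and the same object is returned.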

    Args:
        old (dict): The dictionary to be processed recursively.

    Returns:
        dict: The dictionary with nested dictionaries sorted.
    """
    for key in old:
        if isinstance(old[key], dict):
            old[key] = self.dict_update(old[key])
    return old

dict_update
dict_update(old)

Update and sort a dictionary recursively.

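This method first sorts every nested dictionary (replacing it inside the input), then sorts the top-level keys into a new dictionary. Only values that are themselves dictionaries are visited; dictionaries stored inside lists or other containers are left as they are.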

Parameters:

Name Type Description Default
old dict

The dictionary to be sorted and updated.

required

Returns:

Name Type Description
dict dict

The updated dictionary with sorted keys at all levels.

Source code in pyformatjson/core/_base.py
def dict_update(self, old: dict) -> dict:
    """Update and sort a dictionary recursively.

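    This method first sorts every nested dictionary (replacing it inside
    the input), then sorts the top-level keys into a new dictionary.
    Dictionaries stored inside lists or other containers are not visited.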

    Args:
        old (dict): The dictionary to be sorted and updated.

    Returns:
        dict: The updated dictionary with sorted keys at all levels.
    """
    old = self.dict_sort_iteration(old)
    old = self.dict_sort(old)
    return old
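
The three methods above combine a per-level key sort with recursion into nested values. As a minimal, self-contained sketch of the same idea, the snippet below builds a fresh dictionary at every level instead of writing back into the input. The names `natural_key` and `sort_nested` are hypothetical stand-ins chosen for illustration and are not part of `pyformatjson`.

```python
import re


def natural_key(s: str) -> list:
    """Illustrative stand-in for sort_strings_with_embedded_numbers."""
    pieces = re.split(r"(\d+)", s)
    # Odd positions hold the captured digit runs; compare them as integers.
    pieces[1::2] = map(int, pieces[1::2])
    return pieces


def sort_nested(data: dict, reverse: bool = False) -> dict:
    """Return a new dict with naturally sorted keys at every nesting level."""
    result = {}
    for key in sorted(data, key=natural_key, reverse=reverse):
        value = data[key]
        # Recurse into nested dicts; every other value is passed through as is.
        result[key] = sort_nested(value, reverse) if isinstance(value, dict) else value
    return result


data = {"item10": {"sub2": 1, "sub1": 2}, "item2": 3}
sorted_data = sort_nested(data)

print(list(sorted_data))            # ['item2', 'item10']
print(list(sorted_data["item10"]))  # ['sub1', 'sub2']
print(list(data["item10"]))         # ['sub2', 'sub1'] (input left untouched)
```

Like the class, this sketch relies on Python dictionaries preserving insertion order, and it assumes every key is a string; a non-string key would make `re.split` raise `TypeError`.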

Functions

sort_int_str

sort_int_str(str_int, reverse=False)

Sort list of strings with embedded numbers naturally.

This function sorts a list of strings using natural sorting that handles embedded numbers correctly (e.g., "item2" comes before "item10").

Parameters:

Name Type Description Default
str_int list[str]

List of strings to be sorted.

required
reverse bool

If True, sorts in descending order. Defaults to False.

False

Returns:

Type Description
list[str]

list[str]: Sorted list of strings.

Example

>>> sort_int_str(["item10", "item2", "item1"])
['item1', 'item2', 'item10']

Source code in pyformatjson/core/_base.py
def sort_int_str(str_int: list[str], reverse: bool = False) -> list[str]:
    """Sort list of strings with embedded numbers naturally.

    This function sorts a list of strings using natural sorting that handles
    embedded numbers correctly (e.g., "item2" comes before "item10").

    Args:
        str_int (list[str]): List of strings to be sorted.
        reverse (bool, optional): If True, sorts in descending order. Defaults to False.

    Returns:
        list[str]: Sorted list of strings.

    Example:
        >>> sort_int_str(["item10", "item2", "item1"])
        ['item1', 'item2', 'item10']
    """
    return sorted(str_int, key=sort_strings_with_embedded_numbers, reverse=reverse)
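
To make the difference from a plain `sorted` call concrete, the following self-contained sketch re-creates the key function locally and compares the two orderings, including `reverse=True`. The name `natural_key` is a hypothetical stand-in for the module's key function.

```python
import re


def natural_key(s: str) -> list:
    """Illustrative stand-in for the module's key function."""
    pieces = re.split(r"(\d+)", s)
    pieces[1::2] = map(int, pieces[1::2])
    return pieces


keys = ["item10", "item2", "item1"]

plain = sorted(keys)                                      # character-by-character
natural = sorted(keys, key=natural_key)                   # digit runs compared as numbers
descending = sorted(keys, key=natural_key, reverse=True)  # same key, reversed

print(plain)       # ['item1', 'item10', 'item2']
print(natural)     # ['item1', 'item2', 'item10']
print(descending)  # ['item10', 'item2', 'item1']
```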

sort_strings_with_embedded_numbers

sort_strings_with_embedded_numbers(s)

Split string into pieces for natural sorting with embedded numbers.

This function splits a string into pieces where numbers are converted to integers for proper natural sorting (e.g., "item2" comes before "item10").

Parameters:

Name Type Description Default
s str

The string to be split into sortable pieces.

required

Returns:

Type Description
list[str]

list[str]: List of pieces alternating between text and digit runs. The digit runs are converted to int, so the list mixes str and int values despite the annotation.

Example

>>> sort_strings_with_embedded_numbers("item10")
['item', 10, '']

Source code in pyformatjson/core/_base.py
def sort_strings_with_embedded_numbers(s: str) -> list[str]:
    """Split string into pieces for natural sorting with embedded numbers.

    This function splits a string into pieces where numbers are converted to integers
    for proper natural sorting (e.g., "item2" comes before "item10").

    Args:
        s (str): The string to be split into sortable pieces.

    Returns:
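        list[str]: List of pieces alternating between text and digit runs.
            The digit runs are converted to int, so the list mixes str and
            int values despite the annotation.

    Example:
        >>> sort_strings_with_embedded_numbers("item10")
        ['item', 10, '']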
    """
    re_digits = re.compile(r"(\d+)")
    pieces = re_digits.split(s)
    pieces[1::2] = map(int, pieces[1::2])
    return pieces
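
The shape of the returned key explains why the comparison is safe: `re.split` with a capturing group always yields text at even positions and the captured digits at odd positions, so two keys are compared str against str and int against int. A short sketch of that behavior, using only the standard library:

```python
import re

keys = {}
for s in ["item10", "10item", "a1b22", "abc"]:
    pieces = re.split(r"(\d+)", s)
    # Convert only the captured digit runs (odd positions) to integers.
    pieces[1::2] = [int(p) for p in pieces[1::2]]
    keys[s] = pieces
    print(s, "->", pieces)

# item10 -> ['item', 10, '']
# 10item -> ['', 10, 'item']
# a1b22 -> ['a', 1, 'b', 22, '']
# abc -> ['abc']
```

A string that ends in digits produces a trailing empty string, and one that starts with digits produces a leading empty string.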

split_data_list

split_data_list(split_pattern, data_list, last_next='next')

Split data list according to the split pattern.

This function splits each string in the data list using the provided regex pattern and reconstructs the data based on the last_next parameter. The pattern must use capturing parentheses to define split points.

Parameters:

Name Type Description Default
split_pattern str

Regular expression pattern for splitting. Must use capturing parentheses, e.g., r"(\n)" for newline splits.

required
data_list list[str]

List of strings to be split and processed.

required
last_next str

Determines where the matched delimiter is attached. "next" places it at the beginning of the following part, "last" places it at the end of the preceding part. Any other value produces no output for the input lines. Defaults to "next".

'next'

Returns:

Type Description
list[str]

list[str]: New list of processed strings with empty and whitespace-only strings filtered out.

Raises:

Type Description
re.error

If the split_pattern is not a valid regular expression.

Example

>>> split_data_list(r"(\n)", ["line1\nline2", "line3\nline4"], "next")
['line1', '\nline2', 'line3', '\nline4']

Source code in pyformatjson/core/_base.py
def split_data_list(split_pattern: str, data_list: list[str], last_next: str = "next") -> list[str]:
    r"""Split data list according to the split pattern.

    This function splits each string in the data list using the provided regex pattern
    and reconstructs the data based on the last_next parameter. The pattern must use
    capturing parentheses to define split points.

    Args:
        split_pattern (str): Regular expression pattern for splitting. Must use capturing
            parentheses, e.g., r"(\n)" for newline splits.
        data_list (list[str]): List of strings to be split and processed.
        last_next (str, optional): Determines where the matched delimiter is
            attached. "next" places it at the beginning of the following part,
            "last" places it at the end of the preceding part. Any other value
            produces no output for the input lines. Defaults to "next".

    Returns:
        list[str]: New list of processed strings with empty and whitespace-only
            strings filtered out.

    Raises:
        re.error: If the split_pattern is not a valid regular expression.

    Example:
        >>> split_data_list(r"(\n)", ["line1\nline2", "line3\nline4"], "next")
        ['line1', '\nline2', 'line3', '\nline4']
    """
    new_data_list = []
    for line in data_list:
        split_list = re.split(split_pattern, line)
        list_one = split_list[0 : len(split_list) : 2]
        list_two = split_list[1 : len(split_list) : 2]

        temp = []
        if last_next == "next":
            list_two.insert(0, "")
            temp = [list_two[i] + list_one[i] for i in range(len(list_one))]
        if last_next == "last":
            list_two.append("")
            temp = [list_one[i] + list_two[i] for i in range(len(list_one))]
        new_data_list.extend(temp)
    new_data_list = [line for line in new_data_list if line.strip()]
    return new_data_list
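
The even/odd slicing in the source depends on how `re.split` behaves with a capturing group: the matched delimiters are kept in the result, interleaved with the text between them. The sketch below walks through that step for a single string and shows what "next" and "last" each produce. The variable names are illustrative only.

```python
import re

line = "a\nb\nc"
parts = re.split(r"(\n)", line)
print(parts)  # ['a', '\n', 'b', '\n', 'c']

texts = parts[0::2]   # text between delimiters: ['a', 'b', 'c']
delims = parts[1::2]  # captured delimiters:     ['\n', '\n']

# "next": each delimiter is glued to the front of the text that follows it.
as_next = [d + t for d, t in zip([""] + delims, texts)]
# "last": each delimiter is glued to the end of the text that precedes it.
as_last = [t + d for t, d in zip(texts, delims + [""])]

print(as_next)  # ['a', '\nb', '\nc']
print(as_last)  # ['a\n', 'b\n', 'c']
```

Without the capturing parentheses, `re.split` drops the delimiters, and the odd positions would then hold ordinary text rather than delimiters, which is why the pattern must capture.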

split_text_by_length

split_text_by_length(text, max_length=120)

Split text into lines of specified maximum length.

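This function breaks long text into multiple lines, ensuring each line does not exceed the specified maximum length. It breaks at the last space within the limit when one exists and otherwise cuts mid-word. The space at a break point is not removed; it stays at the start of the following line.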

Parameters:

Name Type Description Default
text str

The input text to be split into lines.

required
max_length int

Maximum length for each line. Defaults to 120.

120

Returns:

Type Description
list[str]

list[str]: A list of text lines, each not exceeding max_length characters.

Example

>>> split_text_by_length("This is a very long text that needs to be split", 20)
['This is a very long', ' text that needs to', ' be split']

Source code in pyformatjson/core/_base.py
def split_text_by_length(text: str, max_length: int = 120) -> list[str]:
    """Split text into lines of specified maximum length.

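    This function breaks long text into multiple lines, ensuring each line
    does not exceed the specified maximum length. It breaks at the last space
    within the limit when one exists and otherwise cuts mid-word. The space at
    a break point stays at the start of the following line.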

    Args:
        text (str): The input text to be split into lines.
        max_length (int, optional): Maximum length for each line. Defaults to 120.

    Returns:
        list[str]: A list of text lines, each not exceeding max_length characters.

    Example:
        >>> split_text_by_length("This is a very long text that needs to be split", 20)
        ['This is a very long', ' text that needs to', ' be split']
    """
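    lines = []
    while text:
        if len(text) <= max_length:
            lines.append(text)
            break

        # Prefer the last space that keeps the line within max_length.
        split_pos = text.rfind(" ", 0, max_length + 1)
        if split_pos <= 0:
            # No usable space, or only a leading one (splitting there would
            # yield an empty line and never shrink the text): cut at max_length.
            split_pos = max_length

        lines.append(text[:split_pos])

        # The space at the break point stays at the start of the remainder.
        text = text[split_pos:]

    return lines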
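
For comparison, the standard library's `textwrap.wrap` performs a similar word-boundary split but drops the whitespace at each break. This is offered only as a point of reference, not as what the function above does; the sample text is the one from the docstring.

```python
import textwrap

text = "This is a very long text that needs to be split"

# width is the maximum line length; whitespace at the break points is dropped.
wrapped = textwrap.wrap(text, width=20)
print(wrapped)  # ['This is a very long', 'text that needs to', 'be split']

# A word longer than width is cut into pieces by default (break_long_words=True).
long_word = textwrap.wrap("a" * 25, width=10)
print([len(part) for part in long_word])  # [10, 10, 5]
```

Which behavior is preferable depends on whether the caller needs the lines to concatenate back into the original text: keeping the spaces preserves that property, while `textwrap.wrap` does not.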

standardize_path

standardize_path(path_input)

Standardize and ensure a directory path exists.

This function expands environment variables and user home directory references in the path, then creates the directory if it doesn't exist.

Parameters:

Name Type Description Default
path_input str

The input path to be standardized and created.

required

Returns:

Name Type Description
str str

The expanded path. The function does not call os.path.abspath, so a relative input stays relative after expansion.

Example

>>> standardize_path("~/Documents/data")
'/Users/username/Documents/data'

Source code in pyformatjson/core/_base.py
def standardize_path(path_input: str) -> str:
    """Standardize and ensure a directory path exists.

    This function expands environment variables and user home directory references
    in the path, then creates the directory if it doesn't exist.

    Args:
        path_input (str): The input path to be standardized and created.

    Returns:
        str: The expanded path. The function does not call os.path.abspath,
            so a relative input stays relative after expansion.

    Example:
        >>> standardize_path("~/Documents/data")
        '/Users/username/Documents/data'
    """
    path_input = os.path.expandvars(os.path.expanduser(path_input))
    if not os.path.exists(path_input):
        os.makedirs(path_input)
    return path_input
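
If an absolute result is needed, one possible variant resolves the path after expansion and lets `mkdir` handle an already existing directory. The sketch below is an assumption-laden illustration: `ensure_dir` and the `DEMO_ROOT` environment variable are hypothetical names, and a temporary directory is used so the example leaves nothing behind.

```python
import os
import tempfile
from pathlib import Path


def ensure_dir(path_input: str) -> str:
    """Hypothetical variant of standardize_path that returns an absolute path."""
    expanded = os.path.expandvars(os.path.expanduser(path_input))
    path = Path(expanded).resolve()
    # exist_ok=True avoids a gap between checking for the directory and creating it.
    path.mkdir(parents=True, exist_ok=True)
    return str(path)


with tempfile.TemporaryDirectory() as tmp:
    os.environ["DEMO_ROOT"] = tmp  # hypothetical variable, set only for this demo
    created = ensure_dir("$DEMO_ROOT/nested/data")
    is_dir = os.path.isdir(created)
    is_abs = os.path.isabs(created)

print(is_dir, is_abs)  # True True
```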