linkinglines package

Submodules

linkinglines.ClusterLines module

Created on Thu Apr 1 13:12:50 2021

@author: akh

linkinglines.ClusterLines.AggCluster(dikeset, dtheta, drho, dimensions=2, linkage='complete', rotate=False, metric='Euclidean')

Agglomerative clustering with custom metric on Hough transform data.

Parameters:
  • dikeset (pandas DataFrame) – DataFrame with Hough transform data.

  • dtheta (float) – Scaling factor for theta.

  • drho (float) – Scaling factor for rho.

  • dimensions (int, optional) – Number of dimensions to use for clustering (default is 2).

  • linkage (str, optional) – Linkage method for hierarchical clustering (default is ‘complete’).

  • rotate (bool, optional) – Whether to rotate the dataset (default is False).

  • metric (str, optional) – Metric to use for clustering (default is ‘Euclidean’).

Returns:

  • dikeset (pandas DataFrame) – DataFrame with cluster labels.

  • Z (numpy ndarray) – The hierarchical clustering linkage matrix.
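
As a rough illustration of what the two scaling factors do, the sketch below computes a Euclidean distance between two lines in Hough space after dividing the theta difference by dtheta and the rho difference by drho. This is an assumption about how the scaling enters the metric, not the library's confirmed implementation; the helper name scaled_distance is hypothetical, and the sketch ignores the rotate option and any angular wrap-around.

```python
import math


def scaled_distance(line_a, line_b, dtheta, drho):
    """Euclidean distance between two (theta, rho) pairs after scaling.

    Hypothetical helper: assumes each axis is divided by its scaling factor
    before the distance is taken.
    """
    d_theta = (line_a[0] - line_b[0]) / dtheta
    d_rho = (line_a[1] - line_b[1]) / drho
    return math.hypot(d_theta, d_rho)


# Two lines that differ by exactly one scaling unit in each dimension.
print(scaled_distance((10.0, 1000.0), (12.0, 1500.0), dtheta=2.0, drho=500.0))
```

Under this reading, larger values of dtheta or drho make lines that differ in that dimension look closer together, so they merge earlier in the hierarchy.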

linkinglines.ClusterLines.fullTree(model, **kwargs)

Generate and plot a full dendrogram for hierarchical clustering results.

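Parameters:
  • model – The fitted AgglomerativeClustering model whose hierarchy is plotted.

  • **kwargs – Additional keyword arguments, such as color_threshold in the example below.

Return type: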

None

Note:

This function generates and plots a full dendrogram showing the hierarchical clustering of data based on the provided AgglomerativeClustering model. It calculates counts of samples under each node and creates a linkage matrix for the dendrogram.

Example:

fullTree(clustering_model, color_threshold=0.5)
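
The counting step described in the note can be sketched in plain Python. The sketch assumes the merge tree uses the layout of scikit-learn's AgglomerativeClustering.children_ attribute, where indices below n_samples are original samples and index n_samples + i refers to the node created by merge i. The helper names count_samples and linkage_rows are hypothetical and are not part of linkinglines.

```python
def count_samples(children, n_samples):
    """Return the number of original samples under each merge node."""
    counts = [0] * len(children)
    for i, pair in enumerate(children):
        for child in pair:
            if child < n_samples:
                counts[i] += 1  # leaf: a single original sample
            else:
                counts[i] += counts[child - n_samples]  # earlier merge node
    return counts


def linkage_rows(children, distances, n_samples):
    """Assemble rows of [left, right, distance, count] for a dendrogram."""
    counts = count_samples(children, n_samples)
    return [
        [float(left), float(right), float(dist), float(count)]
        for (left, right), dist, count in zip(children, distances, counts)
    ]


# Four samples: (0, 1) and (2, 3) merge first, then the two nodes merge.
children = [(0, 1), (2, 3), (4, 5)]
distances = [0.2, 0.3, 0.9]
for row in linkage_rows(children, distances, n_samples=4):
    print(row)
```

Rows in this four-column layout are what scipy.cluster.hierarchy.dendrogram expects as its linkage matrix.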

linkinglines.DilationCalculations module

Created on Thu Jul 7 14:45:57 2022

@author: akh

linkinglines.DilationCalculations.TripleDilationPlot(df, lines, shape=['half', 'portrait'], kwargs=None)

Creates a triple-panel plot to visualize dilation results.

This function generates a visualization consisting of three panels: a main plot displaying line segments in Cartesian coordinates, a histogram of North-South (NS) dilation values below the main plot, and a histogram of East-West (EW) dilation values to the right of the main plot. It utilizes the dilation function to calculate dilation values based on the provided line segment data.

Parameters:
  • df (pandas.DataFrame) – A DataFrame containing data points to be visualized in the main panel.

  • lines (pandas.DataFrame) – A DataFrame containing line segment data used for dilation calculation.

  • shape (list, optional) – A list specifying the final figure size and orientation. The first element determines the figure’s size (‘full’, ‘half’, ‘quarter’), and the second element specifies the orientation (‘landscape’, ‘portrait’). Defaults to [‘half’, ‘portrait’].

  • kwargs (dict, optional) – Additional keyword arguments to pass to the dilation function for calculating dilation. This allows for customization of the dilation calculation, such as bin width and method. Defaults to None.

Returns:

  • fig (matplotlib.figure.Figure) – The figure object containing the generated triple-panel plot.

  • ax (list of matplotlib.axes._subplots.AxesSubplot) – A list containing the three axes objects for the main panel, NS dilation histogram, and EW dilation histogram.

Notes

  • The main panel plot displays line segments as provided in the lines DataFrame.

  • Dilation histograms help visualize the distribution of dilation values across NS and EW directions, providing insight into the stretching or expansion patterns in the dataset.

  • Customization options for the dilation calculation are provided via the kwargs parameter, allowing the user to adjust aspects such as bin width and calculation method according to their analysis needs.

linkinglines.DilationCalculations.dilation(df, binWidth=1.0, averageWidth=1.0, method='Expanded')

Calculates dilation values for a dataset along East-West (EW) and North-South (NS) directions.

Dilation is calculated by dividing the data into bins and computing the dilation for each bin based on the specified method. There are three methods available for calculating dilation:

  • ‘Expanded’ considers partial and fully contained line segments in each bin, calculating fractional expansions.

  • ‘Average’ calculates the average dilation of fully contained line segments within each bin.

  • ‘Total’ sums the dilation values of fully contained line segments within each bin.

Parameters:
  • df (pandas.DataFrame) – The input DataFrame containing data to calculate dilation. Must include columns for line segment coordinates.

  • binWidth (float, optional) – The width of bins used for dilation calculation along both EW and NS directions. Default is 1.0.

  • averageWidth (float, optional) – The average width of line segments used in the dilation calculation. Default is 1.0.

  • method ({'Expanded', 'Average', 'Total'}, optional) – The method used for calculating dilation. Default is ‘Expanded’.

    • ‘Expanded’: Considers partial and fully contained segments, calculating fractional expansions.

    • ‘Average’: Averages dilation values of fully contained segments within each bin.

    • ‘Total’: Sums dilation values of fully contained segments within each bin.

Returns:

  • EWDilation (numpy.ndarray) – An array of calculated dilation values along the East-West direction.

  • NSDilation (numpy.ndarray) – An array of calculated dilation values along the North-South direction.

  • binx (numpy.ndarray) – The bin edges along the East-West direction.

  • biny (numpy.ndarray) – The bin edges along the North-South direction.

Notes

The choice of method (‘Expanded’, ‘Average’, ‘Total’) depends on the specific requirements of the geological analysis. ‘Expanded’ provides detailed calculations including partial segments, ‘Average’ offers smoother dilation values, and ‘Total’ gives a cumulative measure of dilation within bins.
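
To make the idea of fractional contributions concrete, the sketch below shows one possible reading of the ‘Expanded’ method along a single axis: each segment adds its width to a bin in proportion to the fraction of the segment's extent that falls inside that bin. The function name expanded_dilation, the one-dimensional (lo, hi) extents, and the weighting itself are assumptions made for illustration; the library's actual calculation may differ.

```python
def expanded_dilation(segments, bin_edges, width=1.0):
    """Sum fractional width contributions per bin along one axis.

    segments  : list of (lo, hi) extents of each segment along the binned axis
    bin_edges : increasing list of bin edges
    width     : assumed width of every segment
    """
    totals = [0.0] * (len(bin_edges) - 1)
    for lo, hi in segments:
        lo, hi = min(lo, hi), max(lo, hi)
        length = hi - lo
        if length == 0:
            continue  # zero-extent segments are skipped in this sketch
        for i in range(len(totals)):
            overlap = min(hi, bin_edges[i + 1]) - max(lo, bin_edges[i])
            if overlap > 0:
                totals[i] += width * overlap / length
    return totals


# A segment straddling two bins contributes half of its width to each.
print(expanded_dilation([(0.5, 1.5)], [0.0, 1.0, 2.0], width=2.0))
```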

linkinglines.ExamineClusters module

examineMod: Module for examining and analyzing line clusters.

This module provides functions to analyze and examine line clusters, including computing bounding rectangles, evaluating cluster properties, and checking for changes between clusters.

Functions:
  • OutputRectangles(clusters): Compute the coordinates of bounding rectangles for each cluster.

  • examineClusters(clusters): Generate a summary of properties for each cluster in a set of line clusters.

  • examineClusterShort(clusters): Generate a shorter, faster summary of properties for clusters in a set of line clusters.

  • checkAllClusterChange(lines1, lines2): Check if two sets of line clusters are the same.

  • checkoutCluster(clusters, label): Display information and plots related to a cluster of lines.

  • checkoutClusterCart(clusters, label): Display information and plots related to a cluster of lines in Cartesian space.

  • CheckoutBy(dikeset, lines, col): Plot information based on cluster metrics.

  • RotateOverlap(lines): Calculate the overlap ratio and maximum overlap count of lines after rotation.

  • enEchelonAngleTwist(d, avgtheta): Calculate the angle twist and statistical significance of features.

  • extendLines(lines): Extends lines in dataframe up to min(x)*.5 and max(x)*1.5.

linkinglines.ExamineClusters.CheckoutBy(dikeset, lines, col, maximum=True, minimum=False)

Display cluster information and plots for lines with maximum or minimum values in a specified column.

This function allows you to analyze and visualize clusters of lines based on the maximum or minimum values in a specified column. It can display cluster information and plots for both maximum and minimum values independently.

Parameters ‘dikeset’ and ‘lines’ should contain the necessary line data, and ‘col’ should specify the column in ‘lines’ to determine maximum or minimum values. You can choose to display information for maximum values, minimum values, or both by setting the ‘maximum’ and ‘minimum’ flags accordingly.

The function utilizes the ‘checkoutCluster’ function to generate cluster information and plots and prints the label and the corresponding maximum or minimum value of the specified column.

Parameters:
  • dikeset (pandas.DataFrame) – A DataFrame containing line data with columns ‘xc’, ‘yc’, ‘Labels’, ‘theta’, and ‘rho’.

  • lines (pandas.DataFrame) – A DataFrame containing line data with a ‘Label’ column and the specified ‘col’ column.

  • col (str) – The column name in ‘lines’ to identify the maximum or minimum values.

  • maximum (bool, default True) – If True, display information and plots for lines with the maximum ‘col’ value.

  • minimum (bool, default False) – If True, display information and plots for lines with the minimum ‘col’ value.

Returns:

  • fig (matplotlib.figure.Figure) – The Figure object for the generated plots.

  • ax (matplotlib.axes.Axes) – The Axes object for the generated plots.

Example

>>> fig, ax = ll.CheckoutBy(dikeset, lines, 'theta', maximum=True, minimum=True)
>>> plt.show()

linkinglines.ExamineClusters.OutputRectangles(clusters)

Compute the coordinates of bounding rectangles for each cluster in a set of line clusters.

This function uses the ‘Xmid’ and ‘Ymid’ attributes of the clusters to determine the center points and calculates the coordinates of the corners of rectangles that enclose the clusters.

Parameters:

clusters (DataFrame) – A DataFrame containing line clusters with attributes ‘Labels’, ‘Xmid’, ‘Ymid’.

Returns:

  • Xs (numpy.ndarray) – An array of X-coordinates for the corners of bounding rectangles for each cluster.

  • Ys (numpy.ndarray) – An array of Y-coordinates for the corners of bounding rectangles for each cluster.

linkinglines.ExamineClusters.RotateOverlap(lines)

Calculate the overlap ratio and maximum overlap count of lines after rotation.

This function takes a DataFrame of line data and performs the following steps:

  1. Calculates the mean angle (‘theta’) of the lines.

  2. Rotates the line data by the complementary angle (90 degrees minus the mean angle).

  3. Transforms the ‘Xstart’ values to ensure they are in the correct order.

  4. Computes the total length of the line segments.

  5. Determines the overlapping segments along the rotated x-axis and calculates the overlap ratio.

The overlap ratio is defined as the ratio of the length of overlapping segments to the total length of the segments. Higher ratios indicate more overlap.

A ratio of 1 indicates two segments that overlap completely, while a ratio of 2 indicates three segments that overlap completely.

The maximum overlap count represents the maximum number of overlapping segments along the rotated x-axis.

Parameters:

lines (DataFrame) – A DataFrame with columns ‘theta’ for the angle of each line segment, ‘Xstart’, and ‘Xend’ indicating the starting and ending points of the line segments along the x-axis.

Returns:

  • overlap_ratio (float) – The ratio of the total length of overlapping segments to the total length of all segments after rotation. Values closer to 1 indicate high overlap between two segments, while higher values indicate overlap among three or more segments.

  • max_overlap_count (int) – The maximum number of line segments overlapping along the rotated x-axis.
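
The two return values can be illustrated with a sweep over interval endpoints. This sketch is one interpretation chosen to reproduce the examples in the text (two fully overlapping segments give a ratio of 1, three give a ratio of 2): it divides the length covered more than once, weighted by the number of extra segments, by the length covered at least once. The function name overlap_stats is hypothetical, the rotation step is assumed to have been applied already, and the library's exact normalization may differ.

```python
def overlap_stats(intervals):
    """Return (overlap_ratio, max_overlap_count) for 1-D intervals."""
    events = []
    for start, end in intervals:
        lo, hi = min(start, end), max(start, end)
        events.append((lo, 1))   # interval opens
        events.append((hi, -1))  # interval closes
    # Tuple sort puts closings before openings at equal x, so intervals that
    # only touch at an endpoint are not counted as overlapping.
    events.sort()

    active = 0
    previous_x = None
    covered = 0.0
    overlapped = 0.0
    max_count = 0
    for x, delta in events:
        if previous_x is not None and active > 0:
            step = x - previous_x
            covered += step
            overlapped += (active - 1) * step
        active += delta
        max_count = max(max_count, active)
        previous_x = x

    ratio = overlapped / covered if covered > 0 else 0.0
    return ratio, max_count


print(overlap_stats([(0, 1), (0, 1)]))          # two identical segments
print(overlap_stats([(0, 1), (0, 1), (0, 1)]))  # three identical segments
print(overlap_stats([(0, 1), (2, 3)]))          # disjoint segments
```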

linkinglines.ExamineClusters.checkAllClusterChange(lines1, lines2)

Compare two sets of line clusters to check if they are the same.

This function compares two sets of line clusters to determine whether they are identical or different. It does this by sorting and hashing the ClusterHash values of both sets and then comparing the resulting hash values. If the hash values are the same, the clusters are considered the same; otherwise, they are considered different. Both must have “ClusterHash” as a column.

Parameters:
  • lines1 (pandas.DataFrame) – The first set of line clusters as a DataFrame.

  • lines2 (DataFrame) – The second set of line clusters as a DataFrame.

Returns:

True if the clusters are the same, False otherwise.

Return type:

bool
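
The sort-then-hash comparison can be sketched with the standard library. The choice of SHA-256 and the helper names clusters_fingerprint and same_clusters are assumptions for illustration; the text does not say which hash function the library uses.

```python
import hashlib


def clusters_fingerprint(cluster_hashes):
    """Hash the sorted ClusterHash values so that row order does not matter."""
    joined = ",".join(sorted(str(h) for h in cluster_hashes))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def same_clusters(hashes1, hashes2):
    """Return True if both collections contain the same ClusterHash values."""
    return clusters_fingerprint(hashes1) == clusters_fingerprint(hashes2)


print(same_clusters(["b7", "a1", "c3"], ["a1", "c3", "b7"]))
print(same_clusters(["a1", "b7"], ["a1", "d9"]))
```

Sorting first means that two DataFrames holding the same clusters in a different row order still compare as equal.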

linkinglines.ExamineClusters.checkIndividualClusterChange(df1, df2)

Compare individual line clusters between two sets of data frames.

This function identifies which cluster labels are common to df1 and df2 (eqLabels) and which are unique to one of them (diffLabels). It also provides information about the number of overlapping clusters and the total number of clusters in each data frame.

Parameters:
  • df1 (pandas.DataFrame) – The first set of line clusters as a DataFrame.

  • df2 (pandas.DataFrame) – The second set of line clusters as a DataFrame.

Returns:

  • eqLabels (numpy.ndarray) – An array of labels that are found in both df1 and df2.

  • diffLabels (numpy.ndarray) – An array of labels that are unique to either df1 or df2.
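
The split into common and unique labels amounts to a set intersection and a symmetric difference. The sketch below works on plain lists of labels; the name compare_labels is hypothetical, and the values the library actually compares are an assumption here.

```python
def compare_labels(labels1, labels2):
    """Return (eq_labels, diff_labels) for two collections of cluster labels."""
    set1, set2 = set(labels1), set(labels2)
    eq_labels = sorted(set1 & set2)    # present in both inputs
    diff_labels = sorted(set1 ^ set2)  # present in exactly one input
    return eq_labels, diff_labels


eq_labels, diff_labels = compare_labels([1, 2, 3, 4], [3, 4, 5])
print(eq_labels, diff_labels)
```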

linkinglines.ExamineClusters.checkoutCluster(dikeset, label)

Display information and plots related to a cluster of lines.

This function takes a DataFrame containing line data and a label specifying a cluster of lines. It generates two subplots:

  1. Scatter plot of lines’ theta and rho values, highlighting the selected cluster in red.

  2. Rectangle plot showing the lines in the cluster along with additional information.

The function calculates and displays the following information:

  • Mean angle (in degrees) of the lines in the cluster.

  • Mean rho (in km) of the lines in the cluster.

  • Length and width of the cluster.

  • Aspect ratio (length/width) of the cluster.

  • Size of the cluster (number of lines).

Additionally, it handles cases where the cluster crosses the zero angle boundary and adjusts the plot accordingly.

Parameters:
  • dikeset (pandas.DataFrame) – A DataFrame containing line data with columns ‘xc’, ‘yc’, ‘Labels’, ‘theta’, and ‘rho’.

  • label (int) – The label of the cluster to be analyzed.

Returns:

  • fig (matplotlib.figure.Figure) – The Figure object for the generated plots.

  • ax (matplotlib.axes.Axes) – The Axes object for the generated plots.

Example:

>>> fig, ax = ll.checkoutCluster(lines, 2)
>>> plt.show()

linkinglines.ExamineClusters.checkoutClusterCart(dikeset, label, fig, ax)

Visualize and annotate cluster information in a Cartesian coordinate system.

This function takes as input a pandas DataFrame containing line segment data, a specific cluster label, and matplotlib Figure and Axes objects for plotting. It outputs a visualization of the specified cluster’s line segments within a Cartesian coordinate system and annotates this visualization with the cluster’s mean angle, length, width, and size. Additionally, the function prints details of the cluster’s mean angle, rho mean, length, and width for further reference.

Parameters:
  • dikeset (DataFrame) – A DataFrame containing line data. Expected columns include ‘xc’ and ‘yc’ for Cartesian coordinates, ‘Labels’ for cluster labels, ‘theta’ for segment angles, and ‘rho’ for the distance from the origin to the line.

  • label (int) – The label identifying the cluster to be visualized and analyzed.

  • fig (matplotlib.figure.Figure) – The Figure object provided by matplotlib for plotting. It serves as the canvas on which the plot is drawn.

  • ax (matplotlib.axes.Axes) – The Axes object provided by matplotlib, representing the space within the figure where the data is plotted.

Returns:

ax – The modified Axes object with the cluster plot and annotations.

Return type:

matplotlib.axes.Axes

Examples

>>> fig, ax = plt.subplots()
>>> ax = ll.checkoutClusterCart(line_data, 2, fig, ax)
>>> plt.show()

linkinglines.ExamineClusters.enEchelonAngleTwist(d, avgtheta)

Calculate the angle twist and statistical significance for en échelon features.

This function is designed to analyze en échelon features, which are represented as line segments, to determine the angle twist and its statistical significance. The process includes fitting a linear regression model to the midpoints of the line features to estimate their orientation, computing the p-value for the linear regression to assess the alignment’s statistical significance, and calculating the angle twist based on the p-value. The angle twist is defined as the angle difference between the linear model’s orientation and the average angle of the line segments if the p-value indicates significant alignment. Otherwise, the angle twist is set to 0, suggesting no significant alignment.

Parameters:
  • d (pandas.DataFrame) – A DataFrame containing en échelon line segment data, which must include columns for ‘Xstart’, ‘Xend’, ‘Ystart’, ‘Yend’, ‘Xmid’, and ‘Ymid’. These represent the start and end points of the line segments and their midpoints, respectively.

  • avgtheta (float) – The average angle (in degrees) of the line segments, used for comparison with the estimated orientation of the en échelon feature.

Returns:

  • angle_twist (float) – The angle difference (in degrees) between the en échelon feature’s orientation and the average angle of the line segments. It reflects the degree of twist if the alignment is statistically significant, otherwise set to 0.

  • p_value (float) – The p-value from the linear regression model, indicating the statistical significance of the alignment between the line segments. Values below 0.05 suggest significant alignment.

  • Xstart (float) – The starting x-coordinate of the line representing the estimated orientation.

  • Xend (float) – The ending x-coordinate of the line representing the estimated orientation.

  • Ystart (float) – The y-coordinate at the starting x-coordinate of the estimated orientation line.

  • Yend (float) – The y-coordinate at the ending x-coordinate of the estimated orientation line.

Notes

The function assumes significant alignment if the p-value is below 0.05, following standard statistical significance levels.
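
The decision logic described above can be sketched without any plotting or DataFrame handling. The sketch fits a least-squares line through the midpoints, converts its slope to an angle, and returns the folded difference from the average angle only when the supplied p-value is below the threshold. Several points are assumptions: the helper names are hypothetical, the p-value is passed in by the caller (for example from scipy.stats.linregress) rather than computed here, and avg_angle is treated as a trend angle measured from the x-axis, which may not match the library's theta convention.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def angle_twist(xmid, ymid, avg_angle, p_value, alpha=0.05):
    """Angle (degrees) between the midpoint trend and avg_angle, or 0."""
    if p_value >= alpha:
        return 0.0  # alignment not significant: no twist reported
    trend = math.degrees(math.atan(fit_slope(xmid, ymid)))
    diff = abs(trend - avg_angle) % 180.0
    return min(diff, 180.0 - diff)  # fold into [0, 90] for undirected lines


# Midpoints trending at 45 degrees, segments averaging 30 degrees.
print(angle_twist([0, 1, 2], [0, 1, 2], avg_angle=30.0, p_value=0.01))
print(angle_twist([0, 1, 2], [0, 1, 2], avg_angle=30.0, p_value=0.20))
```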

linkinglines.ExamineClusters.evaluationOnClusters(clusters_data)

Calculate evaluation metrics based on cluster data.

This function computes various evaluation metrics based on the provided cluster data. The metrics include information about the number of clusters, cluster sizes, rho and theta range statistics, average lengths and widths of clusters, en échelon angle differences, and more.

The resulting summary DataFrame provides insights into the distribution and characteristics of the clusters, making it useful for further analysis and interpretation of the data.

Parameters:

clusters_data (pandas.DataFrame) – A DataFrame containing cluster information.

Returns:

evaluation – A summary DataFrame with evaluation metrics.

Return type:

pandas.DataFrame

linkinglines.ExamineClusters.examineClusterShort(clusters)

Analyze and summarize information about clustered line segments. This is the shorter and faster version of examineClusters.

This function analyzes and summarizes information about clustered line segments. It calculates various statistics for each cluster, including the starting and ending coordinates, the average rho (distance from the origin), the average theta (angle), and the size (number of lines) within each cluster. Additionally, it computes the midpoint of each cluster and determines if it is a clustered or non-clustered line segment.

The function returns a DataFrame (‘clusters_data’) containing the summarized information for each cluster, including its label, coordinates, average rho, average theta, size, and a hash identifier.

Parameters:

clusters (pandas.DataFrame) – A DataFrame containing line data with columns ‘Labels’, ‘theta’, ‘rho’, and ‘HashID’.

Returns:

clusters_data – A DataFrame containing summarized information for each cluster.

Return type:

pandas.DataFrame

See also

examineClusters

A more detailed version of this function that provides additional information about the clusters.

linkinglines.ExamineClusters.examineClusters(clusters, enEchelonCutofff=7, ifEE=False, MaxNNSegDist=0.5, skipUnlinked=True, xc=None, yc=None)

Analyze and summarize information about clusters of line segments.

This function analyzes and summarizes information about clusters of line segments. It calculates various statistics for each cluster, including coordinates, average rho (distance from the origin), average theta (angle), size (number of lines) within each cluster, and other cluster-related information.

It also includes a TrustFilter based on the maximum normalized nearest neighbor segment distance. The TrustFilter is calculated by first computing all nearest neighbor distances between line segments in a cluster. The maximum, median, and minimum of these distances are normalized by the total length of the clustered segments. MaxNNSegDist is then used to filter out clusters whose maximum normalized nearest neighbor segment distance is greater than the specified value. For example, a cluster with 3 segments evenly distributed along its length (one segment in the middle and one at each end of a line) would have a maximum normalized nearest neighbor segment distance of 0.5, whereas a cluster with 3 segments in which two are close together and one is far away would have a value greater than 0.5 and up to 1.0.
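
The three-segment examples can be reproduced with a small sketch. It reduces each segment to a single position along the cluster's long axis and normalizes by a supplied total length; both simplifications, and the helper names, are assumptions rather than the library's implementation.

```python
def normalized_nn_distances(positions, total_length):
    """Nearest-neighbor distance of each position, divided by total_length.

    Requires at least two positions.
    """
    distances = []
    for i, p in enumerate(positions):
        nearest = min(abs(p - q) for j, q in enumerate(positions) if j != i)
        distances.append(nearest / total_length)
    return distances


def passes_trust_filter(positions, total_length, max_nn_seg_dist=0.5):
    """True if the largest normalized NN distance does not exceed the cutoff."""
    return max(normalized_nn_distances(positions, total_length)) <= max_nn_seg_dist


# Evenly spaced segments along a cluster of length 10.
print(normalized_nn_distances([0, 5, 10], 10), passes_trust_filter([0, 5, 10], 10))
# Two segments close together and one far away.
print(normalized_nn_distances([0, 1, 10], 10), passes_trust_filter([0, 1, 10], 10))
```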

Parameters:
  • clusters (pandas.DataFrame) – A DataFrame containing line data with columns ‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’, ‘seg_length’, ‘ID’, ‘rho’, ‘theta’, ‘Labels’, and ‘PerpOffsetDist’.

  • enEchelonCutofff (int, default = 7) – A cutoff value for en échelon angle differences.

  • ifEE (bool, default = False) – A flag indicating whether en échelon analysis should be performed.

  • MaxNNSegDist (float, default = 0.5) – The maximum normalized nearest neighbor segment distance for the TrustFilter, a value between 0 and 1.

  • skipUnlinked (bool, default = True) – A flag to skip unlinked segments.

  • xc (float, optional) – The x-coordinate of the center

  • yc (float, optional) – The y-coordinate of the center

Returns:

  • clusters_data (pandas.DataFrame) – A DataFrame containing summarized information for each cluster.

    1. Label: Cluster label or identifier.

    2. Xstart: Starting x-coordinate of the clustered line.

    3. Ystart: Starting y-coordinate of the clustered line.

    4. Xend: Ending x-coordinate of the clustered line.

    5. Yend: Ending y-coordinate of the clustered line.

    6. X0: Midpoint of the x-coordinate range of the cluster.

    7. Y0: Midpoint of the y-coordinate range of the cluster.

    8. AvgRho: Average rho for the cluster.

    9. AvgTheta: Average angle (theta) for the cluster.

    10. AvgSlope: Average slope of the lines in the cluster.

    11. AvgIntercept: Average intercept of the lines in the cluster.

    12. RhoRange: Range of rho values within the cluster.

    13. Aspect: Aspect ratio, calculated as the length (l) divided by the width (w).

    14. Xmid: X-coordinate of the midpoint of the fitted rectangle

    15. Ymid: Y-coordinate of the midpoint of the fitted rectangle

    16. PerpOffsetDist: Average perpendicular offset distance for the lines in the cluster.

    17. PerpOffsetDistRange: Range of perpendicular offset distances within the cluster.

    18. NormPerpOffsetDist: Normalized perpendicular offset distance.

    19. ThetaRange: Range of theta values within the cluster.

    20. StdRho: Standard deviation of rho values within the cluster.

    21. StdTheta: Standard deviation of theta values within the cluster.

    22. R_Width: Width (w) of the cluster.

    23. R_Length: Length (l) of the cluster.

    24. Size: Number of lines in the cluster.

    25. R_error: Square root of the error (r) in the cluster’s line fit.

    26. Linked: Indicates whether the lines in the cluster are considered linked or not.

    27. SegmentLSum: Sum of the lengths of line segments within the cluster.

    28. ClusterHash: A hash identifier for the cluster.

    29. ClusterCrossesZero: Indicates whether the cluster’s angles cross zero.

    30. EnEchelonAngleDiff: Twist angle difference for features within the cluster.

    31. Overlap: Overlap of line segments within the cluster.

    32. nOverlapingSegments: Number of overlapping segments within the cluster.

    33. EEPvalue: P-value related to en échelon analysis.

    34. MaxSegNNDist: Maximum normalized nearest neighbor segment distance.

    35. MedianSegNNDist: Median normalized nearest neighbor segment distance.

    36. MinSegNNDist: Minimum normalized nearest neighbor segment distance.

    37. TrustFilter: A filter indicating trustworthiness based on the maximum normalized nearest neighbor segment distance.

    38. xc: X-coordinate of the Hough transform (HT) origin.

    39. yc: Y-coordinate of the Hough transform (HT) origin.

    40. Date_Changed: Date string of generation or change time.

  • evaluation (pandas.DataFrame) – A DataFrame containing summary statistics of the clusters.

    1. nClusters: The number of clusters in the clusters_data DataFrame.

    2. nDikePackets: The number of clusters with an Overlap value greater than 0.1.

    3. AverageRhoRange: The average range of rho values within the clusters.

    4. MaxRhoRange: The maximum range of rho values within the clusters.

    5. StdRhoRange: The standard deviation of the range of rho values within the clusters.

    6. AverageThetaRange: The average range of theta values within the clusters.

    7. MaxThetaRange: The maximum range of theta values within the clusters.

    8. StdThetaRange: The standard deviation of the range of theta values within the clusters.

    9. AvgClusterSize: The average size (number of lines) of the clusters.

    10. ClusterSizeStd: The standard deviation of the size of the clusters.

    11. ClusterMax: The maximum size (number of lines) among the clusters.

    12. AverageL: The average length (l) of the clusters.

    13. MaxL: The maximum length (l) among the clusters.

    14. StdL: The standard deviation of the length (l) of the clusters.

    15. AverageW: The average width (w) of the clusters.

    16. MaxW: The maximum width (w) among the clusters.

    17. StdW: The standard deviation of the width (w) of the clusters.

    18. nTrustedDikes: The number of clusters that pass the TrustFilter.

    19. MaxEEAngleDiff: The maximum en échelon angle difference among the clusters.

    20. AverageEAngleDiff: The average en échelon angle difference among the clusters.

    21. Date: The date when this summary information was generated.

Example:

>>> import linkinglines as ll
>>> data = ll.readFile('data.csv')
>>> cluster_data, _ = ll.AggCluster(data, 0.5, 0.5)
>>> cluster_summary, cluster_evaluation = ll.examineClusters(cluster_data,
                                                      enEchelonCutofff=10,
                                                      ifEE=True,
                                                      MaxNNSegDist=0.6,
                                                      skipUnlinked=True)

See also

fit_Rec

A function to fit a rectangle to a cluster of line segments.

enEchelonAngleTwist

A function to calculate the angle twist and statistical significance for en échelon features.

RotateOverlap

A function to calculate the overlap ratio and maximum overlap count of lines after rotation.

linkinglines.ExamineClusters.extendLines(lines, save=False, name='Longlines.csv')

Extends lines in dataframe up to min(x)*.5 and max(x)*1.5

Parameters:

lines (pandas.DataFrame) – DataFrame containing line data with columns ‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’, ‘theta’, ‘rho’

Returns:

lines – extended lines dataframe

Return type:

pandas.DataFrame

linkinglines.FitRadialCenters module

FitRadialCenters Module

This module provides functions for fitting a radial model to dike lines data, calculating radial azimuthal angles, measuring the “clumpiness” of radial dike swarm angles, and more.

Functions:
  • CenterFunc(theta, xr, yr, xc, yc): Calculates the radial distance of a point from a given center for a specified angle.

  • RadialFit(lines, plot=False, ColorBy=None, weight='LayerNumber', ThetaRange=[-90, 90], xc=None, yc=None): Fits a radial model to dike lines data, optionally visualizing the fit. It returns center information such as coordinates and goodness of fit.

  • RadialAzimuthal(lines, Center): Calculates radial azimuthal angles of dike lines relative to a specified center.

  • CyclicAngle360(u, v): Computes the cyclic angle between two angles in the range [0, 360).

  • AngleSpacing(rAngle): Calculates statistics related to the spacing between radial azimuthal angles.

  • RipleyRadial(rAngle, plot=False): Measures the “clumpiness” of radial dike swarm angles using the Ripley K function, with an option to visualize the results.

  • ExpandingR(lines, Center): Measures the density of lines at different radial distances from a center.

  • NearCenters(lines, Center, tol=10000, printOn=False): Identifies lines near a specified center within a tolerance, providing detailed information about those lines.

  • writeCenterWKT(df, name): Writes center information to a Well-Known Text (WKT) file, allowing for easy geospatial data export.

linkinglines.FitRadialCenters.AngleSpacing(rAngle)

Calculate statistics related to angle spacing in a set of radial azimuthal angles.

This function computes statistical measures related to the spacing between consecutive angles in a set of radial azimuthal angles. It calculates the mean, median, minimum, and maximum spacing between angles. These statistics provide insights into the distribution and variability of angle spacings within the dataset, which can be critical for understanding patterns or uniformity in radial distributions.

Parameters:

rAngle (numpy.ndarray) – An array of radial azimuthal angles, assumed to be in degrees. The angles should be sorted in ascending order for accurate spacing calculations.

Returns:

  • mean (float) – The mean (average) spacing between consecutive angles in the dataset.

  • median (float) – The median spacing between consecutive angles, representing the middle value when the spacings are sorted in ascending order.

  • min (float) – The minimum spacing between any two consecutive angles in the dataset, indicating the closest proximity between points.

  • max (float) – The maximum spacing between any two consecutive angles, indicating the farthest apart points.

See also

NearCenters

Find lines near a given center within a specified tolerance.
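
The four statistics can be sketched with the standard library. The sketch sorts the angles itself and uses only gaps between consecutive sorted angles; whether the library also counts the wrap-around gap from the last angle back to the first is not stated, so it is left out here. The name angle_spacing is hypothetical.

```python
import statistics


def angle_spacing(angles):
    """Return (mean, median, min, max) of gaps between consecutive angles.

    Requires at least two angles.
    """
    ordered = sorted(angles)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return statistics.mean(gaps), statistics.median(gaps), min(gaps), max(gaps)


# Gaps between the sorted angles are 10, 20 and 30 degrees.
print(angle_spacing([30, 0, 60, 10]))
```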

linkinglines.FitRadialCenters.CenterFunc(theta, xr, yr, xc, yc)

Calculate the radial distance of a point from a center given an angle.

This function computes the radial distance of a point (or points) from a specified center based on a given angle. The radial distance, rhoRadial, is determined from the cartesian coordinates of the point and the center, along with the angle in degrees between them. This calculation is useful for geometric analyses and transformations in a polar coordinate system.

Parameters:
  • theta (float or array_like) – Angle(s) in degrees between the point(s) and the horizontal axis.

  • xr (float) – X-coordinate of the point.

  • yr (float) – Y-coordinate of the point.

  • xc (float) – X-coordinate of the center.

  • yc (float) – Y-coordinate of the center.

Returns:

rhoRadial – The radial distance(s) of the point(s) from the center. If theta is a single float, rhoRadial will be a single float. If theta is an array_like, rhoRadial will be an ndarray of the same shape.

Return type:

float or ndarray
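
One common form of this relation comes from the normal (Hough) form of a line: every line with normal angle theta that passes through the center (xc, yc) has a signed distance from the reference point (xr, yr) of (xc - xr)·cos(theta) + (yc - yr)·sin(theta). The sketch below implements that expression for scalar input. Treating theta as the normal angle and (xr, yr) as the reference origin is an assumption, and the lowercase name center_func is used to mark this as an illustration rather than the library function.

```python
import math


def center_func(theta, xr, yr, xc, yc):
    """Signed rho of a line through (xc, yc) with normal angle theta (degrees),
    measured from the reference point (xr, yr)."""
    t = math.radians(theta)
    return (xc - xr) * math.cos(t) + (yc - yr) * math.sin(t)


# Center at (3, 4) relative to a reference point at the origin.
print(center_func(0.0, 0.0, 0.0, 3.0, 4.0))   # normal along x
print(center_func(90.0, 0.0, 0.0, 3.0, 4.0))  # normal along y
```

Under this reading, lines radiating from a common center trace a sinusoid in (theta, rho) space, which is what makes fitting a center to Hough-transformed lines possible.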

linkinglines.FitRadialCenters.CyclicAngle360(u, v)

Calculate the cyclic angle between two angles in the range [0, 360).

Parameters:
  • u (float or array_like) – First angle.

  • v (float or array_like) – Second angle.

Returns:

d – Cyclic angle between the two angles.

Return type:

float or array_like
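
A plausible reading of "cyclic angle" is the smaller of the two arcs separating the angles on a 360-degree circle. The sketch below implements that reading for scalar input; the name cyclic_angle_360 and the interpretation itself are assumptions.

```python
def cyclic_angle_360(u, v):
    """Smallest separation in degrees between two angles on a full circle."""
    diff = abs(u - v) % 360.0
    return min(diff, 360.0 - diff)


print(cyclic_angle_360(350.0, 10.0))  # the short way around passes through 0
print(cyclic_angle_360(0.0, 180.0))
```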

linkinglines.FitRadialCenters.ExpandingR(lines, Center)

Measure the density of lines at different radial distances from a center.

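Parameters:
  • lines (pandas.DataFrame) – A DataFrame containing the dike lines data.

  • Center (pandas.DataFrame) – A DataFrame containing the center information.

Returns: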

ntol – List of density values at different radial distances.

Return type:

list

linkinglines.FitRadialCenters.NearCenters(lines, Center, tol=10000, printOn=False)

Find lines near a given center within a specified tolerance.

This function identifies and extracts lines from a dataset that are within a certain distance (tolerance) from a specified center point. The lines considered “near” are those whose distance from the center does not exceed the given tolerance. Optionally, the function can print details about these lines. The primary use of this function is in geospatial analyses, where identifying features relative to a point of interest is necessary, such as finding geological lines or faults near a geographic center.

Parameters:
  • lines (pandas.DataFrame) – A DataFrame containing the dike lines data. The DataFrame is expected to have columns that allow calculating the distance of each line from a specified center point.

  • Center (pandas.DataFrame) – A DataFrame containing the center information. This DataFrame should include at least the x and y coordinates of the center.

  • tol (float, optional) – The tolerance within which lines are considered near the center, in the same units as the line and center coordinates. Defaults to 10000 units.

  • printOn (bool, optional) – If True, prints information about the lines found near the center. This can include distances, line IDs, or any other relevant information contained in the lines DataFrame. Defaults to False.

Returns:

  • Close (pandas.DataFrame) – A DataFrame containing the subset of lines from the lines DataFrame that are within the specified tolerance of the center. The structure of this DataFrame mirrors that of lines.

  • Center (pandas.DataFrame) – The unchanged DataFrame containing center information as provided in the input. This is returned to maintain consistency in function outputs and facilitate further processing if needed.

See also

CenterFunc

Calculate the radial distance of a point from a given center for a specified angle.

RadialAzimuthal

Calculate radial azimuthal angles of dike lines relative to a specified center.

ExpandingR

Measure the density of lines at different radial distances from a center.

RipleyRadial

Measure the “clumpiness” of radial dike swarm angles using the Ripley K function.
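
The idea of selecting lines within a tolerance of a center can be sketched in plain Python. This is only an illustration under stated assumptions: "distance" is taken to mean the perpendicular distance from the center to the infinite line through each segment, segments are given as tuples rather than DataFrame rows, and near_center_sketch is a hypothetical name, not the library function.

```python
import math

def near_center_sketch(segments, xc, yc, tol=10000.0):
    """Keep segments whose infinite extension passes within tol of (xc, yc).

    segments: list of (xstart, ystart, xend, yend) tuples (hypothetical layout).
    """
    close = []
    for x1, y1, x2, y2 in segments:
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # skip degenerate segments that have no direction
        # Perpendicular distance from the center to the line through the segment.
        dist = abs(dx * (y1 - yc) - dy * (x1 - xc)) / length
        if dist <= tol:
            close.append((x1, y1, x2, y2))
    return close

segments = [
    (1000.0, 1000.0, 5000.0, 5000.0),   # extension passes through the origin
    (0.0, 50000.0, 10000.0, 50000.0),   # passes 50000 units north of the origin
]
close = near_center_sketch(segments, xc=0.0, yc=0.0, tol=10000.0)
print(len(close))
```

Only the first segment is kept here, because its extension passes through the center while the second stays 50000 units away.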

linkinglines.FitRadialCenters.RadialAzimuthal(lines, Center)

Calculate the radial azimuthal angles of dike lines relative to a given center.

In a radial structure of lines, the Hough transform does not distinguish between lines on either side of the center. This function calculates the radial azimuthal angle of each line relative to a given center, so that lines can be assigned to a side based on their angle relative to the center.

Parameters:
Returns:

rAngle – Array of radial azimuthal angles.

Return type:

numpy.array

linkinglines.FitRadialCenters.RadialFit(lines, plot=False, ColorBy=None, weight='LayerNumber', ThetaRange=[-90, 90], xc=None, yc=None)

Fit a radial model to dike lines data.

This function fits a radial model to a dataset containing dike lines, optionally plots the fitted model, and returns a DataFrame with center information. The fitting process considers dike line orientations and positions to determine the most representative central point. Users can specify parameters for plotting, including whether to plot, how to color the lines, what weights to use, and the range of angles to consider for the fit. The function also allows specifying the center coordinates explicitly.

Parameters:
  • lines (pandas.DataFrame) – A DataFrame containing the dike lines data. Expected to include columns that match the ColorBy and weight parameters if they are used.

  • plot (bool, optional) – If True, the fitted model and the lines will be plotted. Defaults to False.

  • ColorBy (str, optional) – The name of the column in lines DataFrame to use for coloring the lines in the plot. If None, no coloring is applied. Defaults to None.

  • weight (str, optional) – The name of the column in lines DataFrame to use for weighting in the plot. This can be used to emphasize certain lines over others. Defaults to ‘LayerNumber’.

  • ThetaRange (list of float, optional) – A list specifying the range of angles in degrees to consider for the fit, in the format [min, max]. This can be used to limit the analysis to a certain orientation of dike lines. Defaults to [-90, 90].

  • xc (float, optional) – The x-coordinate of the center to be used in the fit. If None, the function attempts to calculate the center automatically. Defaults to None.

  • yc (float, optional) – The y-coordinate of the center to be used in the fit. If None, the function attempts to calculate the center automatically. Defaults to None.

Returns:

Centers – A DataFrame containing the information about the calculated center(s), including coordinates and possibly other metrics derived from the fitting process.

Return type:

pandas.DataFrame

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from linkinglines.FitRadialCenters import RadialFit
>>> lines = pd.DataFrame({
...     'Theta': np.linspace(-90, 90, 100),
...     'Rho': np.linspace(0, 1, 100)
... })
>>> Centers = RadialFit(lines, plot=True, ColorBy='LayerNumber', weight='LayerNumber',
...                     ThetaRange=[-90, 90], xc=None, yc=None)
>>> print(Centers)
linkinglines.FitRadialCenters.RipleyRadial(rAngle, plot=False)

Measure the radial dike swarm angle “clumpiness” using the Ripley K function.

Parameters:
  • rAngle (numpy.array) – Array of radial azimuthal angles.

  • plot (bool, default False) – Whether to plot the Ripley K function. Defaults to False.

Returns:

  • K (numpy.array) – Ripley K function values.

  • K_se (float) – Standard error of the Ripley K function.

  • fg (matplotlib.figure.Figure) – The figure object. Returned only if plot is True.

  • ax (matplotlib.axes.Axes) – The axes object. Returned only if plot is True.

See also

AngleSpacing

Calculate statistics related to angle spacing in a set of radial azimuthal angles.

NearCenters

Find lines near a given center within a specified tolerance.

linkinglines.FitRadialCenters.writeCenterWKT(df, name)

Write center information to a geospatial data file. The file format can be “csv”, “shp”, or “geojson”.

Parameters:
  • df (pandas.DataFrame) – Dataframe containing center information.

  • name (str) – File name to save the geospatial data.

Returns:

df – Dataframe containing center information.

Return type:

pandas.DataFrame

linkinglines.FitRectangle module

Created on Wed Feb 10 13:30:34 2021

@author: akh

fitRectangle: A Module for Fitting Rectangles to Rotated Line Segments

The ‘fitRectangle’ module provides functions for rotating line segments and fitting rectangles around them. It includes utilities for calculating rectangle properties and performing operations on line segments.

Functions:
  • rotateXYShift(ang, x, y, h, k): Rotate and shift coordinates (x, y) about a center point (h, k) by an angle ‘ang’.

  • unrotate(ang, x, y, h, k): Reverse the rotation and shift of coordinates (x, y) about a center point (h, k) by an angle ‘ang’.

  • endpoints(lines): Extract and return the x and y coordinates of endpoints from a DataFrame of line segments.

  • midpoint(lines): Calculate and return the x and y coordinates of midpoints for line segments in a DataFrame.

  • allpoints(lines): Calculate and return all the x and y coordinates along the line segments in a DataFrame.

  • fit_Rec(lines, xc, yc): Fit a rectangle to a cluster of lines and return its width, length, correlation coefficient, and coordinates.

  • RecEdges(xi, yi, avgtheta, x0, y0): Calculate the coordinates of the edges of a rectangle based on parameters.

  • pltLine(lines, xc, yc, ax): Plot a line representing the fitted rectangle on a specified matplotlib axis.

  • W_L(Clusters): Calculate the widths and lengths of rectangles fitted to clusters of lines.

  • squaresError(lines, xc, yc): Calculate the sum of squared errors for a cluster of lines fitted to a rectangle.

linkinglines.FitRectangle.RecEdges(xi, yi, avgtheta, x0, y0)

Calculates the coordinates of the edges of a rectangle based on parameters.

Parameters:
  • xi (numpy.ndarray) – Array of x-coordinates of line endpoints.

  • yi (numpy.ndarray) – Array of y-coordinates of line endpoints.

  • avgtheta (float) – The average angle of the lines.

  • x0 (float) – x-coordinate of the rectangle’s center.

  • y0 (float) – y-coordinate of the rectangle’s center.

Returns:

  • xs (numpy.ndarray) – Array containing x coordinates of the edges of the rectangle.

  • ys (numpy.ndarray) – Array containing y coordinates of the edges of the rectangle.

linkinglines.FitRectangle.W_L(Clusters)

Calculates the widths and lengths of rectangles fitted to clusters of lines.

Parameters:

Clusters (pandas.DataFrame) – A DataFrame containing clusters of lines with columns [‘Labels’, ‘theta’, ‘rho’, ‘seg_length’].

Returns:

  • width (numpy.ndarray) – Array containing widths of fitted rectangles for each cluster.

  • length (numpy.ndarray) – Array containing lengths of fitted rectangles for each cluster.

linkinglines.FitRectangle.allpoints(lines)

Calculate and return all the x and y coordinates along the line segments in a DataFrame.

This function calculates all the x and y coordinates that lie along the line segments in the DataFrame ‘lines’. It evenly samples points along each line segment using linear interpolation.

Parameters:

lines (pandas.DataFrame) – A DataFrame containing line segments with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’].

Returns:

  • xlist (numpy.ndarray) – A NumPy array containing the x coordinates of the points sampled along the line segments.

  • ylist (numpy.ndarray) – A NumPy array containing the y coordinates of the points sampled along the line segments.
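
Evenly sampling points along a segment by linear interpolation can be sketched as follows. The helper all_points_sketch, the tuple layout of the segments, and the number of samples n are assumptions made for illustration; the number of points the library samples per segment is not stated here.

```python
def all_points_sketch(segments, n=5):
    """Sample n evenly spaced points (n >= 2) along each segment.

    segments: list of (xstart, ystart, xend, yend) tuples (hypothetical layout).
    """
    xs, ys = [], []
    for x1, y1, x2, y2 in segments:
        for i in range(n):
            t = i / (n - 1)  # fraction of the way from start to end
            xs.append(x1 + t * (x2 - x1))
            ys.append(y1 + t * (y2 - y1))
    return xs, ys

xs, ys = all_points_sketch([(0.0, 0.0, 4.0, 8.0)], n=5)
print(xs)
print(ys)
```

The first and last sampled points coincide with the segment endpoints, and the middle sample is the midpoint.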

linkinglines.FitRectangle.endpoints(lines)

Extracts and returns the x and y coordinates of endpoints from a DataFrame of line segments.

Parameters:

lines (pandas.DataFrame) – A DataFrame containing line segments with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’].

Returns:

  • xlist (numpy.ndarray) – A NumPy array containing the x coordinates of endpoints.

  • ylist (numpy.ndarray) – A NumPy array containing the y coordinates of endpoints.

linkinglines.FitRectangle.fit_Rec(lines, xc, yc)

Fits a rectangle to a cluster of lines and returns its width, length, correlation coefficient, and coordinates.

Parameters:
  • lines (pandas.DataFrame) – A DataFrame containing line segments with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’, ‘theta’, ‘rho’].

  • xc (float) – x-coordinate of the center of the rectangle.

  • yc (float) – y-coordinate of the center of the rectangle.

Returns:

  • width (float) – The width of the fitted rectangle.

  • length (float) – The length of the fitted rectangle.

  • r (float) – The correlation coefficient of the fitted rectangle.

  • xs (numpy.ndarray) – Array containing x coordinates of the edges of the fitted rectangle.

  • ys (numpy.ndarray) – Array containing y coordinates of the edges of the fitted rectangle.

  • Xmid (float) – The x-coordinate of the midpoint of the fitted rectangle.

  • Ymid (float) – The y-coordinate of the midpoint of the fitted rectangle.

linkinglines.FitRectangle.midpoint(lines)

Calculate and return the x and y coordinates of midpoints for line segments in a DataFrame.

Parameters:

lines (pandas.DataFrame) – A DataFrame containing line segments with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’].

Returns:

  • xlist (numpy.ndarray) – A NumPy array containing the x coordinates of the midpoints.

  • ylist (numpy.ndarray) – A NumPy array containing the y coordinates of the midpoints.

linkinglines.FitRectangle.pltLine(lines, xc, yc, ax)

Plots a line representing the fitted rectangle on an existing matplotlib axis.

Parameters:
  • lines (pandas.DataFrame) – A DataFrame containing line segments with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’, ‘theta’, ‘rho’].

  • xc (float) – x-coordinate of the center of the rectangle.

  • yc (float) – y-coordinate of the center of the rectangle.

  • ax (matplotlib.axes.Axes) – The axis on which the line will be plotted.

Return type:

None

linkinglines.FitRectangle.rotateXYShift(ang, x, y, h, k)

Rotate and shift coordinates (x, y) about a center point (h, k) by an angle ‘ang’ (in radians).

Parameters:
  • ang (float) – The angle in radians by which to rotate the coordinates.

  • x (float or numpy array) – The x-coordinate of the point to be transformed.

  • y (float or numpy array) – The y-coordinate of the point to be transformed.

  • h (float) – The x-coordinate of the center point about which the rotation and shift will be performed.

  • k (float) – The y-coordinate of the center point about which the rotation and shift will be performed.

Returns:

  • xp (float or numpy array) – The transformed x-coordinate after rotating and shifting (x, y) about (h, k).

  • yp (float or numpy array) – The transformed y-coordinate after rotating and shifting (x, y) about (h, k).

Notes

The function rotates the point (x, y) counterclockwise by ‘ang’ radians around the center point (h, k).

linkinglines.FitRectangle.squaresError(lines, xc, yc)

Calculates the sum of squared errors for a cluster of lines fitted to a rectangle.

Parameters:
  • lines (pandas.DataFrame) – A DataFrame containing line segments with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’, ‘theta’, ‘seg_length’].

  • xc (float) – x-coordinate of the center of the rectangle.

  • yc (float) – y-coordinate of the center of the rectangle.

Returns:

r – The sum of squared errors.

Return type:

float

linkinglines.FitRectangle.unrotate(ang, x, y, h, k)

Reverse the rotation and shift of coordinates (x, y) about a center point (h, k) by an angle ‘ang’ (in radians).

Parameters:
  • ang (float) – The angle in radians by which the coordinates were previously rotated.

  • x (float or numpy array) – The x-coordinate of the transformed point.

  • y (float or numpy array) – The y-coordinate of the transformed point.

  • h (float) – The x-coordinate of the center point about which the reverse rotation and shift will be performed.

  • k (float) – The y-coordinate of the center point about which the reverse rotation and shift will be performed.

Returns:

The original coordinates (xr, yr) before the rotation and shift were applied.

Return type:

float, float or numpy array, numpy array

Notes

  • The function reverses the rotation and shift applied to the point (x, y) by ‘ang’ radians around the center point (h, k).

  • It returns the original coordinates (xr, yr) before the transformation.

See also

rotateXYShift

Rotate and shift coordinates (x, y) about a center point (h, k) by an angle ‘ang’.
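
The relationship between a rotation about a center point and its inverse can be sketched with the standard rotation formulas. The functions rotate_about and unrotate_about below are hypothetical helpers, not the library's code; in particular, the sketch returns coordinates in the original frame, and the library's shift convention may differ.

```python
import math

def rotate_about(ang, x, y, h, k):
    """Counterclockwise rotation of (x, y) by ang radians about (h, k)."""
    c, s = math.cos(ang), math.sin(ang)
    xp = h + (x - h) * c - (y - k) * s
    yp = k + (x - h) * s + (y - k) * c
    return xp, yp

def unrotate_about(ang, xp, yp, h, k):
    """Inverse of rotate_about: rotate by -ang about the same center."""
    return rotate_about(-ang, xp, yp, h, k)

# Rotate (2, 1) by a quarter turn about (1, 1), then undo the rotation.
xp, yp = rotate_about(math.pi / 2, 2.0, 1.0, 1.0, 1.0)
xr, yr = unrotate_about(math.pi / 2, xp, yp, 1.0, 1.0)
print((xp, yp), (xr, yr))
```

Applying the inverse with the same angle and center recovers the starting point up to floating-point error.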

linkinglines.HT module

htMOD Module

HT: provides functions for working with line segments and performing Hough Transform-related calculations.

The Hough Transform (HT) is a feature extraction technique used in image analysis and computer vision to detect lines. The Hough Transform algorithm works by transforming lines from Cartesian space to a “slope-intercept” space, where each line is represented by a point in the new space. We use the algorithm from Ballard, D.H. (1981). Generalizing the Hough Transform to Detect Arbitrary Shapes. Pattern Recognition, 13(2), 111-122.

\[\rho = x \cos(\theta) + y \sin(\theta)\]

where \(\rho\) is the perpendicular distance from the origin to the line, and \(\theta\) is the angle between the x-axis and the perpendicular drawn from the origin to the line.

\[\theta = \arctan(-1/m)\]

where “m” is the slope of the line.
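
The two equations above can be applied to a single segment as in the sketch below. It is an illustration only: hough_sketch is a hypothetical helper, the angle is folded into (-90, 90] degrees as an assumption about the convention, and coordinates are measured relative to an assumed center (xc, yc).

```python
import math

def hough_sketch(x1, y1, x2, y2, xc=0.0, yc=0.0):
    """Return (theta in degrees, rho) for the line through one segment."""
    dx, dy = x2 - x1, y2 - y1
    # arctan(-1/m) with m = dy/dx, written with atan2 so that vertical and
    # horizontal segments need no special case.
    theta = math.degrees(math.atan2(-dx, dy))
    # Fold the angle into (-90, 90], since a line has no direction.
    if theta > 90.0:
        theta -= 180.0
    elif theta <= -90.0:
        theta += 180.0
    t = math.radians(theta)
    # rho = x cos(theta) + y sin(theta), using any point on the line.
    rho = (x1 - xc) * math.cos(t) + (y1 - yc) * math.sin(t)
    return theta, rho

theta, rho = hough_sketch(0.0, 1.0, 1.0, 0.0)  # segment with slope m = -1
print(theta, rho)
```

For this segment the slope is -1, so theta = arctan(1), which is 45 degrees, and rho is the distance from the origin to the line, about 0.707. Either endpoint gives the same rho because both lie on the same line.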

Functions:
  • CyclicAngleDist(u, v): calculates the smallest distance between angles

  • HoughTransform(df, xc=None, yc=None): calculates the Hough Transform

  • HT_center(df): calculates only the Cartesian coordinates of the HT center

  • rotateData(df, rotation_angle): rotates a Cartesian dataframe by rotation_angle

  • MidtoPerpDistance(df, xc, yc): calculates the distance from the segment midpoint to the perpendicular intersection

  • moveHTcenter(df, xc=None, yc=None): moves the HT center and recalculates

linkinglines.HT.CyclicAngleDist(u, v)

Calculate the cyclic angle distance between two angles in degrees.

This function calculates the cyclic angle distance between two angles in degrees, considering the cyclical nature of angles. The result represents the smallest angle difference between the two input angles, ranging from 0 to 90 degrees.

Parameters:
  • u (list or array of floats) – The first angle(s) in degrees.

  • v (list or array of floats) – The second angle(s) in degrees.

Returns:

dist – The cyclic angle distance between the two angles, ranging from 0 to 90 degrees.

Return type:

list or array of floats

Example

>>> angle1 = [45.0]
>>> angle2 = [160.0]
>>> distance = CyclicAngleDist(angle1, angle2)
>>> print("Cyclic Angle Distance (degrees):", distance)
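
A distance that ranges from 0 to 90 degrees follows from treating line orientations as periodic with period 180 degrees. The sketch below shows that idea for scalar inputs; cyclic_angle_dist_sketch is a hypothetical name, and the library function operates on lists or arrays and may be implemented differently.

```python
def cyclic_angle_dist_sketch(u, v):
    """Smallest difference between two line orientations, in degrees."""
    d = abs(u - v) % 180.0       # orientations repeat every 180 degrees
    return min(d, 180.0 - d)     # take the shorter way around

d1 = cyclic_angle_dist_sketch(45.0, 160.0)
d2 = cyclic_angle_dist_sketch(-85.0, 85.0)
print(d1, d2)
```

Under this definition, orientations of -85 and 85 degrees are only 10 degrees apart, because both are close to vertical.
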
linkinglines.HT.HT_center(df)

Finds the center of a dataframe of line segments.

Parameters:

df (pandas.DataFrame) – DataFrame of line segments; must contain the columns [“Xstart”, “Ystart”, “Xend”, “Yend”].

Returns:

  • xc (float) – x location of the HT center.

  • yc (float) – y location of the HT center.

linkinglines.HT.HoughTransform(df, xc=None, yc=None)

Calculates the Hough Transform of a dataframe of line segments.

Parameters:
  • df (pandas.DataFrame) – DataFrame of line segments; must contain the columns [“Xstart”, “Ystart”, “Xend”, “Yend”].

  • xc (float, optional) – x location of the HT center. If none is given, the center is calculated from the dataframe.

  • yc (float, optional) – y location of the HT center. If none is given, the center is calculated from the dataframe.

Returns:

newdf – line segments with new columns of [‘theta’, ‘rho’, ‘xc’, ‘yc’]

Return type:

pandas.DataFrame

linkinglines.HT.MidtoPerpDistance(df, xc=None, yc=None)

Find the distance between line segment midpoint and rho line perpendicular intersection.

Parameters:
  • df (pandas.DataFrame) – DataFrame of line segments; must contain the columns [“Xstart”, “Ystart”, “Xend”, “Yend”].

  • xc (float, optional) – x location of the HT center. If none is given, the center is calculated from the dataframe.

  • yc (float, optional) – y location of the HT center. If none is given, the center is calculated from the dataframe.

Returns:

df (pandas.DataFrame) – The line segments with new columns [‘PerpOffDist’, ‘PerpIntX’, ‘PerpIntY’].

linkinglines.HT.moveHTcenter(df, xc=None, yc=None)

Move the center of a DataFrame of line segments to new coordinates (xc, yc).

Parameters:
  • df (pandas.DataFrame) – A DataFrame containing line segments with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’].

  • xc (float, optional) – The new x-coordinate of the center of the line segments. If not provided, the center is calculated from the input DataFrame.

  • yc (float, optional) – The new y-coordinate of the center of the line segments. If not provided, the center is calculated from the input DataFrame.

Returns:

A new DataFrame with line segments adjusted to the new center coordinates (xc, yc).

Return type:

pandas.DataFrame

Notes

  • This function takes a DataFrame of line segments and moves their center to the specified coordinates (xc, yc).

  • If ‘xc’ and ‘yc’ are not provided, the center is calculated based on the input DataFrame.

  • The resulting DataFrame ‘df2’ contains line segments with updated coordinates.

Example

>>> df = pd.DataFrame({'Xstart': [0, 1, 2], 'Ystart': [0, 1, 2], 'Xend': [1, 2, 3], 'Yend': [1, 2, 3]})
>>> new_df = moveHTcenter(df, xc=2.0, yc=2.0)
>>> print(new_df)
   Xstart  Ystart  Xend  Yend
0    -2.0    -2.0  -1.0  -1.0
1    -1.0    -1.0   0.0   0.0
2     0.0     0.0   1.0   1.0
linkinglines.HT.rotateData(df, rotation_angle, xc=None, yc=None)

Rotates a dataframe of line segments by a given angle.

Parameters:
  • df (pandas.DataFrame) – DataFrame of line segments.

  • rotation_angle (float) – angle of rotation in degrees

  • xc (float, optional) – x location of the HT center. If none is given, the center is calculated from the dataframe.

  • yc (float, optional) – y location of the HT center. If none is given, the center is calculated from the dataframe.

Returns:

dfRotated – dataframe of the line segments rotated by the given angle

Return type:

pandas.DataFrame

linkinglines.HT.segLength(df)

Computes and adds a ‘seg_length’ column to a DataFrame, representing the length of line segments.

Parameters:

df (pandas.DataFrame) – The input DataFrame containing line data with ‘Xstart’, ‘Xend’, ‘Ystart’, and ‘Yend’ columns.

Returns:

df – The input DataFrame with an additional ‘seg_length’ column representing the length of line segments.

Return type:

pandas.DataFrame
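
The segment length is presumably the Euclidean distance between the two endpoints. The sketch below shows that calculation on plain dictionaries instead of a DataFrame; seg_length_sketch is a hypothetical helper and only the column names are taken from the description above.

```python
import math

def seg_length_sketch(rows):
    """Return copies of rows (dicts) with an added 'seg_length' entry."""
    out = []
    for row in rows:
        length = math.hypot(row["Xend"] - row["Xstart"],
                            row["Yend"] - row["Ystart"])
        out.append({**row, "seg_length": length})
    return out

rows = [{"Xstart": 0.0, "Ystart": 0.0, "Xend": 3.0, "Yend": 4.0}]
result = seg_length_sketch(rows)
print(result[0]["seg_length"])
```

The input rows are left unchanged in this sketch; whether the library modifies the DataFrame in place or returns a copy is not stated here.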

linkinglines.PlotUtils module

Created on Thu Apr 1 12:49:51 2021

@author: akh

linkinglines.PlotUtils.AngleHistograms(dikeset, lines, ax=None, fig=None, Trusted=True, Annotate=False)

Plot histograms of angles for dike segments and clusters.

This function creates histograms to compare the distribution of angles for dike segments and clusters, including an optional subset for ‘trusted’ clusters. It can also annotate the histogram with lines at specific angles if desired.

Parameters:
  • dikeset (pandas.DataFrame) – DataFrame containing the dike segments data, with a column ‘theta’ for their angles.

  • lines (pandas.DataFrame) – DataFrame containing the clusters data, with columns ‘AvgTheta’ for their average angles and optionally ‘TrustFilter’ to indicate trusted clusters.

  • ax (matplotlib.axes._axes.Axes, optional) – Axes object on which to plot the histograms. If None, a new figure and axes are created. Default is None.

  • fig (matplotlib.figure.Figure, optional) – Figure object for the plot. Only used if ax is None, in which case a new figure is created. Default is None.

  • Trusted (bool, optional) – If True, includes a histogram for clusters marked as trusted based on the ‘TrustFilter’ column. Default is True.

  • Annotate (bool, optional) – If True, adds annotation lines at specific angles to the histogram for visual reference. Default is False.

Returns:

The axes object used for the plot.

Return type:

matplotlib.axes._axes.Axes

Example

>>> AngleHistograms(dikeset=df_segments, lines=df_clusters, Trusted=True, Annotate=True)
    Plots histograms of angles for dike segments and clusters, including 'trusted' clusters, with annotations.

Note

  • The dikeset DataFrame must contain a ‘theta’ column representing the angles of dike segments.

  • The lines DataFrame must contain an ‘AvgTheta’ column for the average angles of clusters and can optionally contain a ‘TrustFilter’ boolean column to filter for trusted clusters.

linkinglines.PlotUtils.DotsHT(fig, ax, lines, color=None, ColorBy='Dike Cluster Length (km)', label=None, cmap=<matplotlib.colors.ListedColormap object>, marker='o', rhoScale=True, Cbar=True, title=None, CbarLabels=True, axlabels=(True, True), StrOn=True, palette=None, alpha=0.4)

Create a scatter plot of rho and theta on a polar plot with customizable attributes.

Parameters:
  • fig (matplotlib.figure.Figure) – The figure object to which the plot is added.

  • ax (matplotlib.axes._subplots.PolarAxes) – The polar axes object on which to create the scatter plot.

  • lines (pandas.DataFrame) – A DataFrame containing the data points to be plotted, with columns for rho and theta.

  • color (str or None, optional) – The color of the data points. Default is None, which uses the default color.

  • ColorBy (str, optional) – The name of the DataFrame column used to determine the color of the data points based on its values. Default is “Dike Cluster Length (km)”.

  • label (str or None, optional) – The label for the colorbar. Default is None, which uses the column name specified by ColorBy.

  • cmap (matplotlib.colors.Colormap, optional) – The colormap used for coloring data points based on ColorBy. Default is cm.turbo.

  • marker (str, optional) – The marker style for the data points. Default is ‘o’.

  • rhoScale (bool, optional) – If True, scales the rho values by dividing by 1000 to display in kilometers. Default is True.

  • Cbar (bool, optional) – If True, displays a colorbar when coloring data points based on ColorBy. Default is True.

  • title (str or None, optional) – The title of the scatter plot. Default is None.

  • CbarLabels (bool, optional) – If True, displays tick labels on the colorbar. Default is True.

  • axlabels (tuple of bool, optional) – Specifies whether to display labels for theta and rho axes. Default is (True, True).

  • StrOn (bool, optional) – If True and ColorBy values are strings, enables string-based coloring. Default is True.

  • palette (str or None, optional) – The name of the color palette to use for string-based coloring. Default is None.

  • alpha (float, optional) – The transparency level of the data points. Default is 0.4.

Returns:

  • fig (matplotlib.figure.Figure) – The figure object used for the plot.

  • ax (matplotlib.axes._subplots.PolarAxes) – The polar axes object used for the plot.

linkinglines.PlotUtils.DotsLines(lines, ColorBy='seg_length', cmap=<matplotlib.colors.ListedColormap object>, linewidth=1, fig=None, ax=None, Cbar=True, CbarLabels=True, StrOn=False, color=None)

Plot line segments and their corresponding Hough Transform scatter plot side by side.

Parameters:
  • lines (pandas.DataFrame) – DataFrame containing data for line segments to be plotted.

  • ColorBy (str, optional) – Column name in lines DataFrame to color the scatter plot points based on its values. Default is “seg_length”.

  • cmap (matplotlib.colors.Colormap, optional) – Colormap for the scatter plot points. Default is cm.turbo.

  • linewidth (int, optional) – Line width for the line segments plot. Default is 1.

  • fig (matplotlib.figure.Figure or None, optional) – The figure object for the plots. If None, a new figure is created. Default is None.

  • ax (list of matplotlib.axes._subplots.PolarAxes or None, optional) – List of two polar axes objects for the plots. If None, new axes will be created. Default is None.

  • Cbar (bool, optional) – If True, displays a colorbar for the scatter plot. Default is True.

  • CbarLabels (bool, optional) – If True, displays labels on the colorbar. Default is True.

  • StrOn (bool, optional) – If True and ColorBy is a string, uses string-based coloring for the scatter plot. Default is False.

  • color (str or None, optional) – Color for the line segments. If None, uses default color. Default is None.

Returns:

  • matplotlib.figure.Figure – The figure object for the plots.

  • list of matplotlib.axes._subplots.PolarAxes – List of two polar axes objects for the plots.

Example

>>> fig, ax = DotsLines(lines_df)
linkinglines.PlotUtils.FixAxisAspect(ax1, ax2)

Adjust the aspect ratios of two matplotlib axes to match each other.

This function aligns the aspect ratios of two provided axes, ensuring that the graphical representation in both axes appears with consistent scaling. It’s particularly useful in comparative visualizations where matching scales and proportions are crucial.

Parameters:
  • ax1 (matplotlib.axes._subplots.AxesSubplot) – The first axis, to which the second axis’s aspect ratio will be adjusted.

  • ax2 (matplotlib.axes._subplots.AxesSubplot) – The second axis, which will be adjusted to match the aspect ratio of the first axis.

Examples

>>> import matplotlib.pyplot as plt
>>> fig, ax1 = plt.subplots()
>>> ax2 = plt.axes([0.2, 0.2, 0.4, 0.4])
>>> FixAxisAspect(ax1, ax2)
>>> plt.show()

Note

The aspect ratio adjustment is based on the size of the axes in inches and their positioning within the figure. This method may alter the ylim or xlim of ax2 to ensure that the aspect ratios match.

class linkinglines.PlotUtils.FixCartesianLabels(ax)

Bases: object

Class for adjusting axis labels by moving the offset of axis tick labels to the axis label.

This class is designed to assist in adjusting the display of axis labels in plots by incorporating the offset (scale factor) of the axis tick labels into the axis label itself. This adjustment is particularly useful in cases where the offset notation is preferred to be part of the axis label, enhancing the readability and presentation of plots.

update(ax, lim)

Updates the axis labels by incorporating the offset into the axis label text.

Examples

>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> ax.plot([1, 2, 3], [1e6, 2e6, 3e6])
>>> y_label_fixer = FixCartesianLabels(ax.yaxis)
>>> plt.show()

Note

The update method is connected to the axis’s ‘Cartesian Plots Updated’ event and is triggered whenever the axis is updated, ensuring that the label adjustments are applied automatically.

update(ax, lim)

Updates the axis labels by moving the offset to the axis label.

This method adjusts the axis labels by incorporating the offset (scale factor) from the axis tick labels into the axis label itself. The update is applied to both the x and y axes.

Parameters:
  • ax (matplotlib.axis.Axis) – The axis for which the labels are to be updated.

  • lim (unused) – An unused parameter, included for compatibility with callback requirements.

Return type:

None

linkinglines.PlotUtils.HThist(lines, rstep, tstep, weights=None, fig=None, ax=None, rbins=None, tbins=None, cmap=<matplotlib.colors.LinearSegmentedColormap object>, gamma=0.3)

Plot a 2D histogram (Hough Transform space) of line data.

This function creates a 2D histogram representing the Hough Transform space of a set of lines. It visualizes the distribution of lines’ orientations (theta) and distances (rho) from the origin. The histogram is plotted with specified steps for rho and theta, and can optionally weight the lines. The plot can be customized with matplotlib’s figure and axes objects, bin sizes, colormap, and a gamma correction for the colormap normalization.

Parameters:
  • lines (pandas.DataFrame or array-like) – Data containing line information. Must have columns or keys for ‘rho’ and ‘theta’, representing the distance and orientation of lines, respectively.

  • rstep (float) – Step size for the rho (distance) dimension of the histogram.

  • tstep (float) – Step size for the theta (orientation) dimension of the histogram.

  • weights (array-like, optional) – Weights for each line, affecting how they contribute to the histogram. Default is None.

  • fig (matplotlib.figure.Figure, optional) – Figure object on which to plot. If None, a new figure is created. Default is None.

  • ax (matplotlib.axes.Axes, optional) – Axes object on which to plot. If None, new axes are created on the figure. Default is None.

  • rbins (array-like, optional) – Bin edges for rho (distance) dimension. If None, bins are created based on rstep. Default is None.

  • tbins (array-like, optional) – Bin edges for theta (orientation) dimension. If None, bins are created based on tstep. Default is None.

  • cmap (matplotlib.colors.Colormap, optional) – Colormap for the histogram. Default is matplotlib.cm.Blues.

  • gamma (float, optional) – Gamma correction for the colormap normalization. Default is 0.3.

Returns:

  • fig (matplotlib.figure.Figure) – The figure object used for the plot.

  • ax (matplotlib.axes.Axes) – The axes object used for the plot.

  • histogram_data (list) – A list containing the histogram array (h), bin edges for x (xe), bin edges for y (ye), and the QuadMesh object (c).

Example

>>> lines = pd.DataFrame({'rho': np.random.rand(100) * 200, 'theta': np.random.rand(100) * 180 - 90})
>>> fig, ax, hist_data = HThist(lines, 1, 10)
>>> plt.show()
linkinglines.PlotUtils.NumtoStringCoord(x, y)
linkinglines.PlotUtils.RGBArraytoHexArray(c)

Convert an array of RGB or RGBA values to an array of Hex values.

This function takes an array of RGB or RGBA tuples and converts them to an array of their corresponding hexadecimal color representations.

Parameters:

c (list) – A list of RGB or RGBA tuples.

Returns:

A list of hex strings in the form ‘#RRGGBB’ or ‘#RRGGBBAA’.

Return type:

list

Example

>>> # Convert an array of RGB (0-1 range) to Hex
>>> rgb_colors = [(0.2, 0.5, 0.8), (0.9, 0.1, 0.4)]
>>> hex_colors = RGBArraytoHexArray(rgb_colors)
>>> print("Hex Colors:", hex_colors)

linkinglines.PlotUtils.RGBtoHex(vals, rgbtype=1)

Converts RGB values in a variety of formats to Hex values.

Parameters:
  • vals (tuple) – An RGB/RGBA tuple

  • rgbtype (int, default 1) –

    Valid values are:

    1 - Inputs are in the range 0 to 1

    256 - Inputs are in the range 0 to 255

Returns:

A hex string in the form ‘#RRGGBB’ or ‘#RRGGBBAA’

Return type:

str
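
The two input ranges selected by rgbtype can be illustrated with a small stand-alone converter. This is a sketch under assumptions: rgb_to_hex_sketch is a hypothetical name, values are rounded to the nearest integer, and uppercase hex digits are used; the library's rounding and letter case may differ.

```python
def rgb_to_hex_sketch(vals, rgbtype=1):
    """Format an RGB or RGBA tuple as '#RRGGBB' or '#RRGGBBAA'."""
    if rgbtype not in (1, 256):
        raise ValueError("rgbtype must be 1 or 256")
    if rgbtype == 1:
        # Scale 0-1 inputs up to 0-255 before formatting.
        vals = [255 * v for v in vals]
    return "#" + "".join(f"{int(round(v)):02X}" for v in vals)

red = rgb_to_hex_sketch((1.0, 0.0, 0.0))               # inputs in 0 to 1
orange = rgb_to_hex_sketch((255, 128, 0), rgbtype=256)  # inputs in 0 to 255
print(red, orange)
```

A fourth value in the tuple is formatted the same way and becomes the alpha byte of the hex string.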

linkinglines.PlotUtils.SetupAGUFig(finalSize, orientation, units='mm')

Set up a Matplotlib figure for creating an American Geophysical Union (AGU) journal-style plot.

This function initializes a Matplotlib figure with specific dimensions and font settings suitable for JGR submission requirements. The figure size can be specified as a standard page size (quarter, half, full) or in custom dimensions. The function also adjusts global font sizes to ensure readability at the specified figure size.

Parameters:
  • finalSize (str or float or tuple or list) – The desired size of the figure. Can be ‘quarter’, ‘half’, ‘full’ for standard JGR sizes, a float indicating a fraction of the full page size, or a tuple/list specifying custom dimensions (width, height).

  • orientation (str) – The orientation of the figure, either ‘landscape’ or ‘portrait’.

  • units (str, optional) – The units for the custom dimensions (‘mm’, ‘cm’, ‘inches’), by default ‘mm’.

Returns:

fig – The initialized Matplotlib figure object with the specified dimensions and font settings.

Return type:

matplotlib.figure.Figure

Examples

>>> fig = SetupAGUFig('half', 'landscape')
>>> plt.plot(x, y)
>>> plt.xlabel('X Label')
>>> plt.ylabel('Y Label')
>>> plt.title('AGU-style Plot')
>>> plt.savefig('AGU_plot.png', dpi=300, bbox_inches='tight')
>>> plt.show()

Notes

  • This function adjusts the global matplotlib font settings which may affect other plots. Consider using matplotlib’s context manager (with plt.rc_context()) if this is a concern.

  • The specified figure size and orientation are intended to match AGU submission standards, ensuring that figures fit well within the page layout of the journal.

linkinglines.PlotUtils.StringCbar(c, fig, ax, values)

Create a colorbar for string-based categorical data.

This function generates a colorbar specifically for visualizing categorical data that has been encoded with colors through a mapping process. It takes as input the collection of plot elements colored according to their category, the figure and axis objects containing the plot, and a list of the categorical values. The function then produces a colorbar that reflects this categorical color mapping, facilitating the interpretation of the plot’s color scheme in relation to the categorical data.

Parameters:
  • c (matplotlib.collections.Collection) – The collection of colored elements in the plot, such as scatter plot points, which have been colored based on categorical data.

  • fig (matplotlib.figure.Figure) – The figure object containing the plot. This is used to ensure that the colorbar is correctly sized and positioned within the plot.

  • ax (matplotlib.axes.Axes) – The axes object on which the plot and the colorbar will be drawn. This specifies the plotting area for both the main plot and the associated colorbar.

  • values (list of str) – A list of strings representing the categories in the categorical data. Each category is associated with a specific color in the plot.

Returns:

colorbar – The colorbar object created for the categorical data. This colorbar shows the color associated with each category, facilitating the interpretation of the plot.

Return type:

matplotlib.colorbar.Colorbar

Examples

>>> # Assuming StringColors function has been used to map 'values' to colors
>>> values = ["Category A", "Category B", "Category C", "Category A"]
>>> color_indices, colormap = StringColors(values, palette="viridis")
>>> scatter = ax.scatter(x, y, c=color_indices, cmap=colormap)
>>> colorbar = StringCbar(scatter, fig, ax, values)
>>> plt.show()

Note

This function is designed to work with plots where categorical data has been mapped to colors using a specific encoding function (e.g., StringColors). It assumes that such a mapping process has already been applied to create the input collection c.

linkinglines.PlotUtils.StringColors(values, palette='turbo')

Map a list of strings to colors using a specified color palette.

This function takes a list of strings and maps each unique string to a unique color from a specified color palette.

Parameters:
  • values (list) – A list of strings to be mapped to colors.

  • palette (str, default "turbo") – The name of the color palette to use.

Returns:

A tuple containing two elements:

  • color_idx (numpy.ndarray) – An array of indices representing the colors for each string.

  • cm (matplotlib.colors.LinearSegmentedColormap) – The colormap used for mapping the strings to colors.

Return type:

tuple

Example

>>> # Map a list of categories to colors
>>> categories = ["Category A", "Category B", "Category C", "Category A"]
>>> color_indices, colormap = StringColors(categories, palette="viridis")
>>> print("Color Indices:", color_indices)
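
The string-to-index half of this mapping can be pictured with the standard library alone. In this sketch string_color_indices is a hypothetical helper, the categories are numbered in sorted order (which may differ from the ordering StringColors uses), and the colormap lookup is left out.

```python
def string_color_indices(values):
    """Map each string to an integer index shared by all equal strings (sketch)."""
    # Sorting the unique strings makes the numbering reproducible.
    categories = sorted(set(values))
    lookup = {name: i for i, name in enumerate(categories)}
    indices = [lookup[v] for v in values]
    return indices, categories


values = ["Category A", "Category B", "Category C", "Category A"]
indices, categories = string_color_indices(values)
print(indices)     # [0, 1, 2, 0]
print(categories)  # ['Category A', 'Category B', 'Category C']
```

When there is more than one category, dividing each index by len(categories) - 1 gives a position in [0, 1] that a continuous colormap can turn into a color.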

linkinglines.PlotUtils.annotateWLines(ax, angles=None)

Annotate the given axis with lines at specified angles.

This function adds lines to a matplotlib axis at specified angles to serve as annotations. It is designed to work when the axis aspect ratio is set to equal. The lines extend from a point just above the top of the axis, across the plotting area, at the specified angles.

Parameters:
  • ax (matplotlib.axes._axes.Axes) – The axes object to which the annotation lines will be added.

  • angles (list of float, optional) – The angles in degrees at which to draw the lines. If None, default angles are used.

Note

The axis must have an equal aspect ratio for the lines to appear correctly oriented.

Example

>>> fig, ax = plt.subplots()
>>> ax.plot([0, 1], [0, 1])
>>> ax.set_aspect('equal')
>>> annotateWLines(ax, angles=[-70, -30, 0, 30, 70])
>>> plt.show()

The lines are drawn from a point just above the current top of the axis, extending across the plotting area.

linkinglines.PlotUtils.breakXaxis(xlim, numAxes=1)

Break the x-axis at the limits given by xlim. Based on the matplotlib example https://matplotlib.org/stable/gallery/subplots_axes_and_figures/broken_axis.html

numAxes cannot be greater than 13.

Parameters:
  • xlim (tuple) – The x limits.

  • numAxes (int, optional) – The number of axes to create with the same breakpoints. Default is 1.

Returns:

  • fig – The figure object.

  • ax (list) – A list of axes objects.

linkinglines.PlotUtils.clustered_lines(xs, ys, theta, length, xmid=None, ymid=None)

Calculate the coordinates of two points to represent a line segment based on clustering.

This function computes the coordinates of two points defining a line segment, given a set of x and y coordinates, a specified angle, and a length. The line segment is designed to have one end clustered around a central point, defined by (xmid, ymid). If the central point is not provided, it is calculated as the mean of the input coordinates.

Parameters:
  • xs (array-like) – An array-like object containing x-coordinates of data points.

  • ys (array-like) – An array-like object containing y-coordinates of data points.

  • theta (float) – The angle of the line segment relative to the horizontal, in degrees.

  • length (float) – The length of the line segment.

  • xmid (float, optional) – The x-coordinate of the central point around which one end of the line segment is clustered. If None, the mean of xs is used.

  • ymid (float, optional) – The y-coordinate of the central point around which one end of the line segment is clustered. If None, the mean of ys is used.

Returns:

A tuple (x1, y1, x2, y2) representing the coordinates of the two endpoints of the line segment.

Return type:

tuple

Examples

>>> xs = [1, 2, 3, 4, 5]
>>> ys = [2, 3, 4, 5, 6]
>>> theta = 45  # Angle in degrees
>>> length = 3
>>> xmid, ymid = 3, 4  # Central point
>>> x1, x2, y1, y2 = clustered_lines(xs, ys, theta, length, xmid, ymid)
>>> print(f'Point 1: ({x1}, {y1})')
>>> print(f'Point 2: ({x2}, {y2})')

Notes

  • The central point (xmid, ymid) serves as the midpoint of the line segment if not otherwise specified.
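
The geometry of a segment centered on a midpoint can be sketched as follows. This is an illustration under stated assumptions rather than the package's implementation: segment_about_midpoint is a hypothetical helper, theta is taken as the angle from the horizontal in degrees as documented, the segment is centered on (xmid, ymid), and a missing midpoint falls back to the mean of the input coordinates.

```python
import math


def segment_about_midpoint(xs, ys, theta, length, xmid=None, ymid=None):
    """Return (x1, y1, x2, y2) for a segment centred on the midpoint (sketch)."""
    # Fall back to the mean of the input coordinates when no midpoint is given.
    if xmid is None:
        xmid = sum(xs) / len(xs)
    if ymid is None:
        ymid = sum(ys) / len(ys)
    # Half-length offsets along the direction given by theta.
    dx = 0.5 * length * math.cos(math.radians(theta))
    dy = 0.5 * length * math.sin(math.radians(theta))
    return xmid - dx, ymid - dy, xmid + dx, ymid + dy


x1, y1, x2, y2 = segment_about_midpoint([1, 2, 3, 4, 5], [2, 3, 4, 5, 6], theta=0, length=4)
print((x1, y1), (x2, y2))  # (1.0, 4.0) (5.0, 4.0)
```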

linkinglines.PlotUtils.combinePlots(fig1, fig2, path)

Combine two figures into one and save the result.

Parameters:
  • fig1 (matplotlib.figure.Figure) – Left-hand figure.

  • fig2 (matplotlib.figure.Figure) – Right-hand figure.

  • path (str) – Path where the combined figure is saved.

Return type:

None

linkinglines.PlotUtils.fontItems(fig, ax)

Generate a list of items with fonts to change in a matplotlib figure.

This function iterates over a given figure and axes (or a list of axes) to compile a list of all text items whose font properties can be modified. This includes titles, axis labels, and tick labels for each axis provided. Additionally, if the figure contains a legend, the fonts of the legend’s texts are also included.

Parameters:
  • fig (matplotlib.figure.Figure) – The matplotlib figure object.

  • ax (matplotlib.axes.Axes or list of matplotlib.axes.Axes) – Single axis object or a list of axis objects contained within fig.

Returns:

fontItems – A list of matplotlib text objects for which the font properties can be changed. This includes titles, axis labels, tick labels, and legend texts within the figure and the specified axes.

Return type:

list

linkinglines.PlotUtils.get_aspect(ax)

Calculate the aspect ratio of a matplotlib axes.

This function calculates the aspect ratio of a given matplotlib axes, taking into account both the aspect ratio of the figure and the aspect ratio of the data displayed in the axes.

Parameters:

ax (matplotlib.axes._subplots.AxesSubplot) – The axes for which to calculate the aspect ratio.

Returns:

The aspect ratio of the axes.

Return type:

float

Example

>>> import matplotlib.pyplot as plt
>>> # Create a sample plot
>>> fig, ax = plt.subplots()
>>> # Calculate and print the aspect ratio
>>> aspect_ratio = get_aspect(ax)
>>> print("Aspect Ratio:", aspect_ratio)

linkinglines.PlotUtils.get_ax_size_inches(ax)

Calculate the size of a matplotlib axis in inches.

This function determines the size (width and height) of a specified matplotlib axis in inches, taking into account the current figure’s DPI settings. It’s useful for precise layout adjustments or when needing to scale other plot elements relative to the axis size.

Parameters:

ax (matplotlib.axes._subplots.AxesSubplot) – The axis for which to calculate the size.

Returns:

  • width (float) – The width of the axis in inches.

  • height (float) – The height of the axis in inches.

Examples

>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> width, height = get_ax_size_inches(ax)
>>> print(f"Width: {width} inches, Height: {height} inches")
linkinglines.PlotUtils.identify_axes(ax_dict, fontsize=12, **kwargs)

Helper to identify the Axes in the examples below.

Draws the label in a large font in the center of the Axes.

Parameters:
  • ax_dict (dict[str, Axes]) – Mapping between the title / label and the Axes.

  • fontsize (int, optional) – How big the label should be

  • **kwargs (dict) – Additional arguments are passed to ax.text.

Return type:

None

linkinglines.PlotUtils.jgrSize(fig, ax, finalSize, units='mm')

Resize a matplotlib figure to a Journal of Geophysical Research (JGR) appropriate size and set dpi to 600.

This function adjusts the size of a matplotlib figure to conform to the specifications for JGR submissions, such as “quarter”, “half”, “full”, or a specific size given by the user. It also sets the figure’s DPI to 600, ensuring high-resolution output suitable for publication.

Parameters:
  • fig (matplotlib.figure.Figure) – The matplotlib figure object to be resized.

  • ax (matplotlib.axes.Axes or list of matplotlib.axes.Axes) – The axes contained within fig which will be adjusted alongside the figure resizing.

  • finalSize (str or float or tuple or list) – The target size for the figure. Can be “quarter”, “half”, “full”, a float representing a fraction of the full size, or a tuple/list specifying the width and height in the chosen units.

  • units (str, optional) – The units of measurement for specifying custom sizes, by default “mm”. Other valid options are “cm” and “inches”.

Returns:

fig – The resized matplotlib figure object.

Return type:

matplotlib.figure.Figure
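
Matplotlib figure sizes are given in inches, so a custom size in millimetres or centimetres has to be converted before a figure is resized. The sketch below shows only that conversion. to_inches is a hypothetical helper, and the named sizes ("quarter", "half", "full") are deliberately not reproduced because their dimensions are not stated here.

```python
# Number of each supported unit in one inch.
UNITS_PER_INCH = {"mm": 25.4, "cm": 2.54, "inches": 1.0}


def to_inches(size, units="mm"):
    """Convert a (width, height) pair in the given units to inches (sketch)."""
    if units not in UNITS_PER_INCH:
        raise ValueError(f"unknown units: {units!r}")
    width, height = size
    factor = UNITS_PER_INCH[units]
    return width / factor, height / factor


width_in, height_in = to_inches((190, 115), units="mm")
print(f"{width_in:.2f} x {height_in:.2f} inches")  # 7.48 x 4.53 inches
```

The resulting pair can be passed to fig.set_size_inches(width_in, height_in).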

linkinglines.PlotUtils.labelSubplots(ax, labels=None, **kwargs)

Add alphabetical labels to the corner of each subplot.

This function labels each subplot in a figure with an alphabetical character, starting from “A”. It can enhance the clarity of figures containing multiple subplots by providing a simple reference system.

Parameters:
  • ax (list, dict, or ndarray) – A collection of AxesSubplot objects to be labeled. If a single AxesSubplot is provided, it will be converted into a list automatically.

  • labels (list of str, optional) – Custom labels to use instead of the default alphabetical labels. The list must have the same length as the ax collection. If None, labels will default to “A”, “B”, “C”, etc.

  • **kwargs (dict) – Additional keyword arguments to be passed to the label positioning function.

Examples

>>> import matplotlib.pyplot as plt
>>> fig, axs = plt.subplots(2, 2)
>>> labelSubplots(axs.flatten())
>>> plt.show()

Note

The function supports lists, dictionaries, or arrays of AxesSubplot objects. In the case of a single AxesSubplot, it will be handled appropriately by converting to a list format internally.

linkinglines.PlotUtils.labelcolors(labels, colormap)

Assigns colors to unique labels using a colormap.

This function takes a list of labels and a colormap and assigns a unique color to each unique label based on the colormap. It returns a list of colors corresponding to the input labels.

Parameters:
  • labels (list or pandas.Series) – A list of labels.

  • colormap (matplotlib.colors.Colormap) – A colormap to assign colors from.

Returns:

  • colors (list) – A list of colors in hexadecimal format (#RRGGBB) corresponding to the input labels.

  • colors_short (list) – A list of short color names (e.g., ‘red’, ‘blue’) corresponding to the input labels.

Example

>>> # Assign colors to unique labels using a colormap
>>> labels = ['A', 'B', 'A', 'C', 'B']
>>> colormap = plt.get_cmap('viridis')
>>> label_colors, short_colors = labelcolors(labels, colormap)

linkinglines.PlotUtils.plotBreak(xlim, x, y, ax1, ax2, marker, **kwargs)

Plots breakpoints.

linkinglines.PlotUtils.plotByLoc(lines, col, log_scale=(False, False))

Plot histograms of a column against Xmid and Ymid, color-coded by the column values.

This function creates a 2x2 subplot grid where the left side displays histograms of a specified column against ‘Xmid’ and ‘Ymid’, and the right side shows color bars corresponding to the histograms. The histograms are color-coded based on the values in the specified column, providing a visual distribution of the data. Options to apply a logarithmic scale to the x and y axes of the histograms are also available.

Parameters:
  • lines (pandas.DataFrame) – The DataFrame containing the data to be plotted. Must include ‘Xmid’, ‘Ymid’, and the specified column col.

  • col (str) – The name of the column in lines DataFrame to plot and color-code against ‘Xmid’ and ‘Ymid’.

  • log_scale (tuple of bool, optional) – Specifies whether to apply a logarithmic scale on the x-axis and y-axis of the histograms, respectively. Format is (x_log, y_log). Default is (False, False).

Returns:

  • fig (matplotlib.figure.Figure) – The figure object containing the plots.

  • axs (list of matplotlib.axes._subplots.AxesSubplot) – A list containing the two main axes objects for the histograms.

  • axins (list of matplotlib.axes._axes.Axes) – A list containing the two inset axes objects for the color bars.

linkinglines.PlotUtils.plotRadialOver(fig, ax1, ax2, xc, yc, Crange=50000, n=4, step=None, color='gray', color2='red', colorLines=False)
linkinglines.PlotUtils.plotRatioLine(ax, x, ratio, line_kw=None)

Plot a line with a specified ratio.

This function plots a line on a given axis with a specified ratio (slope) by specifying the x values. You can customize the appearance of the line using the line_kw argument.

Parameters:
  • ax (matplotlib.axes._subplots.AxesSubplot) – The axis object on which to plot the line.

  • x (array-like) – The x values for the line.

  • ratio (float) – The desired slope (ratio) of the line.

  • line_kw (dict, optional) – A dictionary of keyword arguments to customize the line’s appearance (e.g., color, linestyle, label).

Returns:

  • ax (matplotlib.axes._subplots.AxesSubplot) – The modified axis object.

  • l (list of matplotlib.lines.Line2D) – A list containing the line objects created.

Example

>>> # Plot a line with a 1:2 slope (y = 0.5 * x)
>>> fig, ax = plt.subplots()
>>> ax, line = plotRatioLine(ax, x=[0, 10], ratio=0.5, line_kw={'color': 'red', 'linestyle': '--', 'label': 'Line'})
>>> ax.legend()

linkinglines.PlotUtils.plotScatterHist(lines, x, y, hue=None, hue_norm=None, xlim=None, ylim=None, log_scale=(False, False), palette='Spectral', style=None, **kwargs)

Plot a scatter plot with marginal histograms for two variables and optionally color by a third variable.

This function plots data from a DataFrame as a scatter plot along with histograms for the x and y variables on the top and right margins, respectively. The scatter plot points can be colored based on a third variable. The histograms help visualize the distribution of the x and y variables, providing a comprehensive view of the data.

Parameters:
  • lines (pandas.DataFrame) – The DataFrame containing the data to be plotted.

  • x (str) – The column name in lines to be used for x-axis values in the scatter plot.

  • y (str) – The column name in lines to be used for y-axis values in the scatter plot.

  • hue (str, optional) – The column name in lines whose values are used to color the data points in the scatter plot. Default is None.

  • hue_norm (tuple, optional) – The normalization range (min, max) for the hue variable. Applies only if hue is not None. Default is None.

  • xlim (tuple, optional) – The limits for the x-axis as a tuple (min, max). Default is None, which autoscales the x-axis.

  • ylim (tuple, optional) – The limits for the y-axis as a tuple (min, max). Default is None, which autoscales the y-axis.

  • log_scale (tuple, optional) – A tuple of booleans specifying whether to apply a logarithmic scale to the x and y axes, respectively. Format is (x_log, y_log). Default is (False, False).

  • palette (str or list, optional) – The color palette for the hue variable. Can be a string specifying a seaborn palette or a list of colors. Default is ‘Spectral’.

  • style (str, optional) – The marker style for the scatter plot. For example, ‘o’ for circles, ‘s’ for squares. Default is None, which uses default markers.

  • **kwargs (dict) – Additional keyword arguments passed to the scatter plot function.

Returns:

  • matplotlib.figure.Figure – The figure object containing the scatter plot and marginal histograms.

  • list of matplotlib.axes._subplots.AxesSubplot – A list containing the axes objects for the scatter plot, x-axis histogram, and y-axis histogram.

Example

>>> data = pd.DataFrame({'theta': np.random.rand(100), 'rho': np.random.rand(100), 'label': np.random.choice(['1', '2', '3'], 100)})
>>> fig, axes = plotScatterHist(data, 'theta', 'rho', hue='label', palette='viridis')
linkinglines.PlotUtils.plotlines(data, col, ax, alpha=1, myProj=None, maskar=None, linewidth=1, ColorBy=None, center=False, xc=None, yc=None, extend=False, cmap=<matplotlib.colors.ListedColormap object>, cbarStatus=False, SpeedUp=True, equal=True)

Plots line segments on a specified axis with optional attributes for color, transparency, and more.

This function allows for the visualization of line segments from a DataFrame containing their start and end UTM coordinates. It offers various customization options including color coding by data attributes, transparency, line extension, and more.

Parameters:
  • data (pandas.DataFrame) – DataFrame containing ‘Xstart’, ‘Xend’, ‘Yend’, ‘Ystart’ columns representing UTM coordinates of line segments.

  • col (str or RGB tuple) – Color for plotting the lines, specified as a color name string or an RGB tuple (e.g., (0.5, 0.5, 0.5)).

  • ax (matplotlib.axes._axes.Axes) – Axes object for plotting.

  • alpha (float, optional) – Transparency level of lines (0.0 transparent, 1.0 opaque). Default is 1.

  • myProj (pyproj.Proj, optional) – PyProj projection object for converting UTM coordinates to lat/long. Default is None.

  • maskar (array-like, optional) – Logical mask indicating which lines to plot. Default is None.

  • linewidth (float, optional) – Width of the plotted lines. Default is 1.

  • ColorBy (str, optional) – Column name from data to color lines based on its values. Default is None.

  • center (bool, optional) – Whether to plot the center point. Default is False.

  • xc (float, optional) – X-coordinate of the center point, required if center is True. Default is None.

  • yc (float, optional) – Y-coordinate of the center point, required if center is True. Default is None.

  • extend (bool, optional) – If True, extends lines beyond their endpoints. Default is False.

  • cmap (matplotlib.colors.Colormap, optional) – Colormap for coloring lines based on ColorBy. Default is cm.turbo.

  • cbarStatus (bool, optional) – If True, displays a colorbar for color-coded lines. Default is False.

  • SpeedUp (bool, optional) – If True, downsamples data for faster plotting with large datasets. Default is True.

  • equal (bool, optional) – If True, sets aspect ratio to equal, maintaining scale. Default is True.

Return type:

None

Example

>>> plotlines(data, 'red', ax, alpha=0.6, ColorBy='Theta', cbarStatus=True)
>>> # Plots line segments in red with alpha transparency, color lines based on 'Theta' column, and displays a colorbar.
linkinglines.PlotUtils.pltRec(lines, xc, yc, fig=None, ax=None)

Plot a rectangle defined by the center and lines, illustrating the orientation and dimensions.

This function creates a visualization of a rectangle that represents the orientation and dimensions of a set of lines with respect to a given center point. It optionally takes a matplotlib figure and axis for plotting or creates new ones if not provided. The rectangle is determined based on the aggregate properties (angles and lengths) of the lines DataFrame.

Parameters:
  • lines (pandas.DataFrame) – A DataFrame containing line data, with columns for angles (‘theta’) and lengths (‘rho’).

  • xc (float) – The x-coordinate of the center point.

  • yc (float) – The y-coordinate of the center point.

  • fig (matplotlib.figure.Figure, optional) – The figure object for the plot. If None, a new figure is created.

  • ax (matplotlib.axes._subplots.AxesSubplot, optional) – The axes object for the plot. If None, new axes are created on the provided or new figure.

Returns:

  • fig (matplotlib.figure.Figure) – The figure object used for plotting.

  • ax (matplotlib.axes._subplots.AxesSubplot) – The axes object used for plotting.

  • length (float) – The calculated length of the rectangle.

  • width (float) – The calculated width of the rectangle.

Examples

>>> # Example DataFrame of lines
>>> import pandas as pd
>>> lines_df = pd.DataFrame({'theta': [0, 90, 45], 'rho': [5, 5, 7]})
>>> xc, yc = 0, 0  # Center point
>>> fig, ax, length, width = pltRec(lines_df, xc, yc)
>>> plt.show()

Note

The orientation of the rectangle is determined by the mean angle of the lines, and its dimensions are based on the mean and range of the lengths (rho) of the lines. The rectangle is plotted with its center at the given (xc, yc) point, rotated to match the average orientation of the lines.

linkinglines.PlotUtils.splitData(xlim, x)

Splits data into two groups based on xlim. Assumes only one breakpoint.

linkinglines.PrePostProcess module

Created on Thu Jul 1 11:04:53 2021

@author: akh

Contains functions for preprocessing and postprocessing data, including reading WKT files and exporting WKT files for use in GIS programs.

  • writeFile: Writes the dataframe to a variety of file types, including .csv, .txt, .shp, .geojson, and .json.

  • readFile: Reads in a file and returns a pandas dataframe.

  • writeToWKT: Writes a dataframe to a CSV file using WKT.

  • writetoGeoData: Writes a dataframe to a shapefile, geopackage, or geojson file.

  • WKTtoArray: Processes a WKT CSV into a pandas dataframe with columns Xstart, Ystart, Xend, Yend, seg_length.

  • giveID: Gives a numeric ID to data.

  • midPoint: Finds the midpoint of a dataframe of line segments.

  • giveHashID: Assigns a hash ID to a dataframe based on line endpoints.

  • segLength: Calculates segment length.

  • transformXstart: Reorders the dataframe so that Xstart is always < Xend.

  • dikesetReProcess: Reprocesses a dataframe containing dike line data to ensure it has essential attributes and is properly formatted.

  • LinesReProcess: Reprocesses a dataframe containing line data to ensure it has essential attributes and is properly formatted.

  • preProcess: Fully preprocesses a dataframe containing line data to ensure it has essential attributes and is properly formatted.

  • whichForm: Returns the form of the dataframe column names.

  • MaskArea: Returns the dataframe masked by bounds.

  • getCartLimits: Computes the Cartesian limits (x and y) of a set of lines.

linkinglines.PrePostProcess.FilterLines(lines)

Filters lines based on a trust filter.

Parameters:

lines (pandas.DataFrame) – The input dataframe containing line data.

Returns:

A filtered dataframe containing only the lines marked as trusted (TrustFilter == 1).

Return type:

pandas.DataFrame

linkinglines.PrePostProcess.LinesReProcess(df, HTredo=True)

Reprocesses a dataframe containing line data to ensure it has essential attributes and is properly formatted.

Parameters:
  • df (pandas.DataFrame) – The input dataframe containing line data.

  • HTredo (bool, default True) – Indicates whether to recalculate Hough Transform attributes, by default True.

Returns:

The processed dataframe with added or updated attributes.

Return type:

pandas.DataFrame

linkinglines.PrePostProcess.MaskArea(df, bounds)

Masks a dataframe by specified bounds.

Parameters:
  • df (pandas.DataFrame) – A dataframe with columns ‘Xstart’ and ‘Ystart’.

  • bounds (list or tuple) – The bounding box specified as [x1, y1, x2, y2], where x1<x2 and y1<y2.

Returns:

A masked dataframe containing only the rows within the specified bounds.

Return type:

pandas.DataFrame
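
The masking rule can be illustrated without pandas. In this sketch mask_area is a hypothetical stand-in that works on a list of dictionaries, tests only the start point of each line (the documented required columns are 'Xstart' and 'Ystart'), and treats the bounds as inclusive. The inclusivity is an assumption.

```python
def mask_area(rows, bounds):
    """Keep rows whose start point lies inside [x1, y1, x2, y2] (sketch)."""
    x1, y1, x2, y2 = bounds
    if not (x1 < x2 and y1 < y2):
        raise ValueError("bounds must satisfy x1 < x2 and y1 < y2")
    return [
        row for row in rows
        if x1 <= row["Xstart"] <= x2 and y1 <= row["Ystart"] <= y2
    ]


lines = [
    {"Xstart": 0.0, "Ystart": 0.0},
    {"Xstart": 5.0, "Ystart": 5.0},
    {"Xstart": 12.0, "Ystart": 3.0},
]
inside = mask_area(lines, [1, 1, 10, 10])
print(inside)  # [{'Xstart': 5.0, 'Ystart': 5.0}]
```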

linkinglines.PrePostProcess.WKTtoArray(df, plot=False)

Processes a dataframe with WKT strings to a pandas dataframe with explicit geometry columns.

Parameters:
  • df (pandas.DataFrame) – Dataframe with a ‘WKT’ or ‘geometry’ column.

  • plot (bool, optional) – Whether to plot the processed lines, by default False.

Returns:

Dataframe with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’, ‘seg_length’].

Return type:

pandas.DataFrame
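
To make the conversion concrete, here is a dependency-free sketch that parses a single WKT LINESTRING into the documented columns. linestring_to_row is hypothetical. Taking the first and last vertices as the endpoints and measuring seg_length as the straight-line distance between them are assumptions, and the package itself may rely on a geometry library instead.

```python
import math
import re


def linestring_to_row(wkt):
    """Parse 'LINESTRING (x y, x y, ...)' into endpoint columns (sketch)."""
    match = re.fullmatch(r"\s*LINESTRING\s*\((.*)\)\s*", wkt, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a LINESTRING: {wkt!r}")
    # Each vertex is 'x y' (any z value is ignored); vertices are comma separated.
    points = [tuple(float(v) for v in p.split()[:2]) for p in match.group(1).split(",")]
    (xs, ys), (xe, ye) = points[0], points[-1]
    return {
        "Xstart": xs, "Ystart": ys, "Xend": xe, "Yend": ye,
        "seg_length": math.hypot(xe - xs, ye - ys),
    }


print(linestring_to_row("LINESTRING (0 0, 3 4)"))
# {'Xstart': 0.0, 'Ystart': 0.0, 'Xend': 3.0, 'Yend': 4.0, 'seg_length': 5.0}
```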

linkinglines.PrePostProcess.dikesetReProcess(df, HTredo=True, xc=None, yc=None)

Reprocesses a dataframe containing dike line data to ensure it has essential attributes and is properly formatted.

Parameters:
  • df (pandas.DataFrame) – Input dataframe containing dike line data.

  • HTredo (bool, default True) – Whether to recalculate Hough Transform attributes, by default True.

  • xc (float, optional) – X-coordinate of the center point for the Hough Transform, by default None.

  • yc (float, optional) – Y-coordinate of the center point for the Hough Transform, by default None.

Returns:

df – Processed dataframe with added or updated attributes.

Return type:

pandas.DataFrame

linkinglines.PrePostProcess.getCartLimits(lines)

Computes the Cartesian limits (x and y) of a set of lines.

Parameters:

lines (pandas.DataFrame) – The input dataframe containing line data.

Returns:

  • xlim (float) – The limits of the x-axis.

  • ylim (float) – The limits of the y-axis.

linkinglines.PrePostProcess.giveHashID(df)

Assigns a hash ID to a dataframe based on line endpoints.

Parameters:

df (pandas.DataFrame) – Dataframe with line segment data.

Returns:

Dataframe with an added ‘HashID’ column based on the hash of line endpoints.

Return type:

pandas.DataFrame
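
One way to derive a reproducible ID from endpoints is to hash their formatted coordinates. This is only a sketch: endpoint_hash is hypothetical, and the choice of SHA-1, the rounding precision, and the truncation to 12 hex digits are all assumptions, not the scheme giveHashID uses.

```python
import hashlib


def endpoint_hash(xstart, ystart, xend, yend, ndigits=3):
    """Return a short, deterministic ID built from the segment endpoints (sketch)."""
    # Round first so that tiny floating-point differences do not change the ID.
    key = ",".join(f"{round(v, ndigits):.{ndigits}f}" for v in (xstart, ystart, xend, yend))
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


a = endpoint_hash(0.0, 0.0, 3.0, 4.0)
b = endpoint_hash(0.0, 0.0, 3.0, 4.0000001)
c = endpoint_hash(0.0, 0.0, 3.0, 5.0)
print(a == b, a == c)  # True False
```

A digest from hashlib is stable across interpreter sessions, whereas the built-in hash() of a string varies between sessions by default, which matters if IDs are written to disk.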

linkinglines.PrePostProcess.giveID(df)

Assigns a numeric ID to the dataframe rows.

Parameters:

df (pandas.DataFrame) – The input dataframe.

Returns:

The dataframe with an added ‘ID’ column.

Return type:

pandas.DataFrame

linkinglines.PrePostProcess.midPoint(df)

Finds the midpoint of a dataframe of line segments.

Parameters:

df (pandas.DataFrame) – Dataframe containing the line segments with columns [“Xstart”, “Ystart”, “Xend”, “Yend”].

Returns:

df – Dataframe with new columns [‘Xmid’, ‘Ymid’] indicating the midpoints of the line segments.

Return type:

pandas.DataFrame

linkinglines.PrePostProcess.preProcess(data)

Fully preprocesses a dataframe containing line data to ensure it has essential attributes and is properly formatted. The steps are:

  1. Convert the WKT column to array format, if present.

  2. Transform Xstart so that Xstart < Xend.

  3. Calculate segment lengths.

  4. Assign unique hash IDs.

  5. Calculate midpoints.

  6. Calculate Hough Transform attributes (theta, rho, xc, yc).

  7. Calculate perpendicular offset distances.

  8. Assign the processing date.

  9. Remove duplicate entries and report any duplicates found.

Parameters:

data (pandas.DataFrame) – The input dataframe containing line data.

Returns:

The fully processed dataframe with all required attributes.

Return type:

pandas.DataFrame
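
Steps 2, 3, and 5 of the list above are simple enough to sketch directly. preprocess_rows is a hypothetical helper that operates on dictionaries rather than a DataFrame. The WKT conversion, hash IDs, Hough Transform attributes, offsets, date stamp, and duplicate removal are omitted.

```python
import math


def preprocess_rows(rows):
    """Apply the endpoint-ordering, length, and midpoint steps to each row (sketch)."""
    out = []
    for row in rows:
        xs, ys, xe, ye = row["Xstart"], row["Ystart"], row["Xend"], row["Yend"]
        # Step 2: swap the endpoints if needed so that Xstart < Xend.
        if xs > xe:
            xs, ys, xe, ye = xe, ye, xs, ys
        out.append({
            "Xstart": xs, "Ystart": ys, "Xend": xe, "Yend": ye,
            # Step 3: segment length from the Euclidean distance.
            "seg_length": math.hypot(xe - xs, ye - ys),
            # Step 5: midpoint of the segment.
            "Xmid": (xs + xe) / 2, "Ymid": (ys + ye) / 2,
        })
    return out


rows = preprocess_rows([{"Xstart": 6.0, "Ystart": 8.0, "Xend": 0.0, "Yend": 0.0}])
print(rows[0])
```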

linkinglines.PrePostProcess.readFile(name, preprocess=True)

Reads in a file and returns a pandas dataframe.

Parameters:
  • name (str) – The path to the file to be read in.

  • preprocess (bool, optional) – Indicates whether to preprocess the data, by default True.

Returns:

data – A pandas or geopandas dataframe containing the read data.

Return type:

pandas.DataFrame

linkinglines.PrePostProcess.transformXstart(dikeset, HTredo=True)

Ensures that ‘Xstart’ is always less than ‘Xend’ in a dataframe of line segments.

Parameters:

dikeset (pandas.DataFrame) – Dataframe with line segments.

Returns:

df – Transformed dataframe where ‘Xstart’ < ‘Xend’.

Return type:

pandas.DataFrame

linkinglines.PrePostProcess.whichForm(lines)

Identifies the form of the dataframe column names.

Parameters:

lines (pandas.DataFrame) – A dataframe with columns containing theta and rho values.

Returns:

A tuple (t, r) containing the string identifiers for theta and rho columns.

Return type:

tuple

linkinglines.PrePostProcess.writeFile(df, name, myProj=None)

Writes a dataframe to a file based on the file extension.

Parameters:
  • df (pandas.DataFrame) – A pandas dataframe to be written to file.

  • name (str) – The name of the file to be written, including the file extension.

  • myProj (str, optional) – The projection of the dataframe, if applicable. Default is None.

Returns:

The input dataframe.

Return type:

pandas.DataFrame

linkinglines.PrePostProcess.writeToWKT(df, name, myProj=None)

Writes a dataframe to a CSV file using Well-Known Text (WKT) format for line vectors.

Parameters:
  • df (pandas.DataFrame) – Dataframe with columns [‘Xstart’, ‘Ystart’, ‘Xend’, ‘Yend’, ‘seg_length’].

  • name (str) – Name of the output file.

  • myProj (pyproj.CRS, optional) – Projection of the dataframe. If None, WGS84 is assumed, by default None.

Returns:

The input dataframe with an additional ‘Linestring’ column.

Return type:

pandas.DataFrame
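
In WKT, a two-point line is written as LINESTRING (x1 y1, x2 y2), which is presumably the text stored in the 'Linestring' column. A minimal sketch of building that string follows. to_linestring is a hypothetical helper, and any reprojection implied by myProj is left out.

```python
def to_linestring(xstart, ystart, xend, yend):
    """Format two endpoints as a WKT LINESTRING (sketch)."""
    return f"LINESTRING ({xstart} {ystart}, {xend} {yend})"


row = {"Xstart": 0.0, "Ystart": 0.0, "Xend": 3.0, "Yend": 4.0}
row["Linestring"] = to_linestring(row["Xstart"], row["Ystart"], row["Xend"], row["Yend"])
print(row["Linestring"])  # LINESTRING (0.0 0.0, 3.0 4.0)
```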

linkinglines.PrePostProcess.writetoGeoData(df, name, driver, myProj=None)

Writes a dataframe to a geospatial file (shapefile, geopackage, or geojson).

Parameters:
  • df (pandas.DataFrame) – Dataframe with geospatial data to be written.

  • name (str) – Name of the output file.

  • driver (str) – Format of the file (e.g., ‘ESRI Shapefile’, ‘GeoJSON’).

  • myProj (pyproj.CRS, optional) – Projection of the dataframe. If None, WGS84 is assumed, by default None.

Returns:

The input dataframe.

Return type:

pandas.DataFrame

linkinglines.SyntheticLines module

Created on Wed Apr 21 14:07:06 2021

This module contains functions for generating synthetic dike data and performing various operations on dike datasets. These functions are primarily used for creating, manipulating, and analyzing synthetic dike or fracture datasets for geological studies and modeling purposes.

Functions:
  • makeRadialSwarmdf: Generate a DataFrame containing radial swarm dike data.

  • makeCircumfrentialSwarmdf: Generate a DataFrame containing circumferential swarm dike data.

  • addSwarms: Combine multiple swarm DataFrames into a single DataFrame.

  • makeLinearDataFrame: Generate a DataFrame containing linear dike data with angle and rho distributions.

  • EnEchelonSynthetic: Generate a DataFrame containing en echelon synthetic dike data.

  • fromHT: Generate a DataFrame containing dike data from angles and rhos using the HT method.

  • fragmentDikes: Fragment dike segments into smaller segments.

@author: akh

linkinglines.SyntheticLines.EnEchelonSynthetic(ndikes, angle, RhoStart, RhoSpacing, Overlap=0, CartRange=100000)

Generate a DataFrame containing en echelon synthetic dike data.

Parameters:
  • ndikes (int)

  • angle (float)

  • RhoStart (float)

  • RhoSpacing (float)

  • Overlap (float, optional) – Default is 0.

  • CartRange (float, optional) – Default is 100000.

Returns:

df, a DataFrame containing en echelon synthetic dike data.

Return type:

pandas.DataFrame

linkinglines.SyntheticLines.addSwarms(dflist)

Combine multiple swarm DataFrames into a single DataFrame.

Parameters:

dflist (list of DataFrames)

Returns:

dfSwarm, a combined DataFrame containing swarm dike data.

Return type:

pandas.DataFrame

linkinglines.SyntheticLines.fragmentDikes(df, nSegments=5)

Fragment dike segments into smaller segments.

Parameters:
  • df (pandas.DataFrame)

  • nSegments (int, optional) – Default is 5.

Returns:

dfFragmented, a DataFrame containing fragmented dike segments.

Return type:

pandas.DataFrame
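
As an illustration of fragmenting, the sketch below splits one segment into equal-length pieces by linear interpolation between its endpoints. This is an assumption about what fragmenting could mean; fragmentDikes may choose break points differently. The function name fragment_segment is hypothetical.

```python
def fragment_segment(xstart, ystart, xend, yend, n_segments=5):
    """Split one segment into n_segments equal pieces (hypothetical helper).

    Returns a list of (xstart, ystart, xend, yend) tuples.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    dx = xend - xstart
    dy = yend - ystart
    pieces = []
    for i in range(n_segments):
        t0 = i / n_segments        # fractional position of the piece start
        t1 = (i + 1) / n_segments  # fractional position of the piece end
        pieces.append(
            (xstart + t0 * dx, ystart + t0 * dy,
             xstart + t1 * dx, ystart + t1 * dy)
        )
    return pieces
```

Each piece ends where the next one begins, so the pieces together cover the original segment.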

linkinglines.SyntheticLines.fromHT(angles, rhos, scale=10000, length=10000, xc=0, yc=0, CartRange=100000, label=None, xrange=None, test=False)

Generate a DataFrame containing dike data from angles and rhos using the HT method.

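Parameters:
  • angles (array-like)

  • rhos (array-like)

  • scale (float, optional) – Default is 10000.

  • length (float, optional) – Default is 10000.

  • xc (float, optional) – Default is 0.

  • yc (float, optional) – Default is 0.

  • CartRange (float, optional) – Default is 100000.

  • label (optional) – Default is None.

  • xrange (float or None, optional) – Default is None.

  • test (bool, optional) – Default is False.

Returns:

df, a DataFrame containing dike data generated from HT parameters.

Return type:

pandas.DataFrame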
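
The sketch below shows one way a segment can be recovered from a Hough transform pair, assuming the standard normal form rho = (x - xc) cos(theta) + (y - yc) sin(theta), with theta in degrees measured to the line's normal. The angle convention used by fromHT may differ, and the function name segment_from_hough is hypothetical. The scale, CartRange, label, xrange, and test arguments are not modelled.

```python
import math


def segment_from_hough(theta_deg, rho, length=10000, xc=0.0, yc=0.0):
    """Return (xstart, ystart, xend, yend) for a line in normal form.

    Assumes theta is the angle of the line's normal in degrees and rho is
    the signed distance from (xc, yc) to the line (hypothetical helper).
    """
    theta = math.radians(theta_deg)
    # Closest point on the line to the center (xc, yc), used as the midpoint.
    xm = xc + rho * math.cos(theta)
    ym = yc + rho * math.sin(theta)
    # The line direction is perpendicular to the normal.
    dx = -math.sin(theta) * length / 2
    dy = math.cos(theta) * length / 2
    return (xm - dx, ym - dy, xm + dx, ym + dy)
```

Both endpoints satisfy the normal-form equation for the given theta and rho, and the segment has the requested length.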

linkinglines.SyntheticLines.makeCircumfrentialSwarmdf(radius, lenf=1, anglestart=-90, anglestop=90, ndikes=50, center=[0, 0], label=1, CartRange=100000)

Generate a DataFrame containing circumferential swarm dike data.

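Parameters:
  • radius (float)

  • lenf (float, optional) – Default is 1.

  • anglestart (float, optional) – Default is -90.

  • anglestop (float, optional) – Default is 90.

  • ndikes (int, optional) – Default is 50.

  • center (list, optional) – Default is [0, 0].

  • label (int, optional) – Default is 1.

  • CartRange (float, optional) – Default is 100000.

Returns:

df, a DataFrame containing circumferential swarm dike data.

Return type:

pandas.DataFrame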

linkinglines.SyntheticLines.makeLinearDataFrame(length, angle, angleSTD, rho, rhoSTD, ndikes=100, CartRange=300000, label=None)

Generate a DataFrame containing linear dike data with angle and rho distributions.

Parameters:
  • length (float) – The length of the dike segments.

  • angle (float) – The mean angle of dike orientation (in degrees).

  • angleSTD (float) – The standard deviation of the angle distribution.

  • rho (float) – The mean rho value of dike segments.

  • rhoSTD (float) – The standard deviation of the rho distribution.

  • ndikes (int) – The number of dikes to generate.

  • CartRange (float) – The Cartesian range to filter dikes based on coordinates.

  • label (int or None) – The label to assign to the generated dikes. If None, labels will be assigned automatically.

Returns:

df, a DataFrame containing linear dike data.

Return type:

pandas.DataFrame
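
To illustrate the role of the mean and standard deviation parameters, the sketch below draws (angle, rho) pairs from independent normal distributions. That the distributions are normal is an assumption; the documentation states only a mean and a standard deviation for each. The function sample_linear_swarm and its seed argument are hypothetical and not part of linkinglines.

```python
import random


def sample_linear_swarm(angle, angle_std, rho, rho_std, ndikes=100, seed=None):
    """Draw ndikes (angle, rho) pairs from normal distributions.

    Hypothetical helper: assumes both distributions are Gaussian.
    """
    rng = random.Random(seed)  # seeded generator for reproducible draws
    return [
        (rng.gauss(angle, angle_std), rng.gauss(rho, rho_std))
        for _ in range(ndikes)
    ]
```

Each pair could then be converted to segment endpoints of the given length, with segments outside CartRange filtered out afterwards.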

linkinglines.SyntheticLines.makeRadialSwarmdf(radius, doubled=True, anglestart=-90, anglestop=90, ndikes=50, center=[0, 0], label=1, CartRange=100000)

Generate a DataFrame containing radial swarm dike data.

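Parameters:
  • radius (float)

  • doubled (bool, optional) – Default is True.

  • anglestart (float, optional) – Default is -90.

  • anglestop (float, optional) – Default is 90.

  • ndikes (int, optional) – Default is 50.

  • center (list, optional) – Default is [0, 0].

  • label (int, optional) – Default is 1.

  • CartRange (float, optional) – Default is 100000.

Returns:

A DataFrame containing radial swarm dike data.

Return type:

pandas.DataFrame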
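
A rough sketch of a radial swarm geometry, assuming evenly spaced angles between anglestart and anglestop (in degrees) and segments that run from the center out to radius. The actual spacing, and the effect of the doubled, label, and CartRange arguments, are not modelled here. The function name radial_swarm is hypothetical.

```python
import math


def radial_swarm(radius, anglestart=-90, anglestop=90, ndikes=50, center=(0, 0)):
    """Return (xstart, ystart, xend, yend) tuples radiating from center.

    Hypothetical helper: assumes evenly spaced angles in degrees.
    """
    cx, cy = center
    segments = []
    for i in range(ndikes):
        # Fraction of the way from anglestart to anglestop for this dike.
        frac = i / (ndikes - 1) if ndikes > 1 else 0.0
        angle = math.radians(anglestart + frac * (anglestop - anglestart))
        segments.append(
            (cx, cy, cx + radius * math.cos(angle), cy + radius * math.sin(angle))
        )
    return segments
```

Every segment in the result starts at the center and ends at distance radius from it.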

Module contents