@epubknowledge/common/misc

General-purpose string and regex utilities.

```js
import {
  tagStripper,
  zeroNum,
  finder,
  replacer,
  classChanger,
  loopReplacer,
  nukeLine,
  alphaToNum,
  validateString,
  verseCompare,
  tabGen,
  toTitleCase,
  applyRules,
  trimElementWhitespace,
} from '@epubknowledge/common/misc'
```

| Export | Description |
| --- | --- |
| `tagStripper(s)` | Strips all HTML tags from a string |
| `zeroNum(num)` | Pads a number with a leading zero if it is less than 10 |
| `finder(text)` | Returns a `RegExp` that matches `text` (with an optional `override` suffix) only when it is immediately preceded by the literal `""` sequence, via the lookbehind `(?<="")` |
| `replacer(input, findText, replaceText)` | Replaces the first occurrence of `findText` in `input`, or every occurrence if `findText` is a global regex |
| `classChanger(input, findText, replaceText)` | Replaces a class name matched by the `override` pattern |
| `loopReplacer(data, findText, replaceText, maxIterations?)` | Repeatedly replaces all matches of `findText` in `data` until none remain or the iteration cap is reached |
| `nukeLine(data, nuke)` | Removes lines that start with one or more tabs and match `nuke` |
| `alphaToNum(str)` | Converts a letter or string of letters to its 1-based alphabetical index (`"a"` → `1`, etc.) |
| `validateString(str)` | Lowercases `str` and removes every non-alphabetic character; returns `""` for `null`/`undefined` |
| `verseCompare(str)` | Strips everything except digits, colons, and hyphens (for verse reference comparison) |
| `tabGen(num?, tab?)` | Generates `num` repetitions of `tab` (defaults: `1`, `"\t"`) |
| `toTitleCase(str)` | Title-cases each word while preserving any leading/trailing spaces |
| `applyRules(data, rules)` | Applies a sequence of `ApplyRule` transforms to an HTML string (element swap, class strip, or rename) |
| `trimElementWhitespace(str)` | Removes whitespace immediately preceding closing tags for block-level text elements (`h1`–`h6`, `li`, `td`, `th`) |
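
To make a few of the descriptions above concrete, here is a minimal sketch of how some of the simpler helpers might behave. These are hypothetical stand-ins written only from the table descriptions, not the package's actual source: the ASCII-only handling in `validateString`, the simple tag regex in `tagStripper`, the assumption that `findText` is a `RegExp`, and the default cap of 100 in `loopReplacer` are all assumptions.

```javascript
// Hypothetical stand-ins written from the descriptions in the table;
// the real @epubknowledge/common/misc source may differ.

// Strips anything that looks like an HTML tag (assumed: a simple regex).
function tagStripper(s) {
  return s.replace(/<[^>]*>/g, "");
}

// Pads numbers below 10 with a leading zero (assumed: non-negative integers).
function zeroNum(num) {
  return num < 10 ? "0" + num : String(num);
}

// Lowercases and keeps only a-z (ASCII assumed); null/undefined become "".
function validateString(str) {
  if (str === null || str === undefined) return "";
  return String(str).toLowerCase().replace(/[^a-z]/g, "");
}

// Keeps only digits, colons, and hyphens.
function verseCompare(str) {
  return str.replace(/[^\d:-]/g, "");
}

// Repeats `tab` `num` times.
function tabGen(num = 1, tab = "\t") {
  return tab.repeat(num);
}

// Re-applies a global replace until nothing matches or the cap is hit.
// Assumes findText is a RegExp; the default cap of 100 is an assumption.
function loopReplacer(data, findText, replaceText, maxIterations = 100) {
  const flags = findText.flags.includes("g") ? findText.flags : findText.flags + "g";
  const re = new RegExp(findText.source, flags);
  let out = data;
  for (let i = 0; i < maxIterations && out.search(re) !== -1; i++) {
    out = out.replace(re, replaceText);
  }
  return out;
}

console.log(tagStripper("<p>Hi <b>there</b></p>"));   // Hi there
console.log(zeroNum(7));                              // 07
console.log(validateString("Hello, World 42!"));      // helloworld
console.log(verseCompare("John 3:16-18"));            // 3:16-18
console.log(loopReplacer("a     b", / {2}/, " "));    // a b
```

The `loopReplacer` call shows why a loop is useful: a single global pass over five spaces still leaves a run of three, so the replacement has to be repeated until the text stops matching.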

Also exports regex-building constants:

| Export | Type | Description |
| --- | --- | --- |
| `nlg` | `string` | String fragment containing a newline and three tabs |
| `bibleRegex` | `string` | Raw regex fragment for capturing 1 to 3 digits |
| `pattern` | `string` | Raw regex fragment for class/id-style text such as letters, digits, and punctuation |
| `override` | `string` | Raw regex fragment for an optional SomeOverride-1-style suffix |
| `unicode` | `string` | Raw regex fragment listing supported Unicode code points for text matching |
| `wordBoundary` | `string` | Raw regex fragment for word-like tokens, including the supported Unicode set |
| `divRegex` | `RegExp` | Matches indented opening and closing `<div>` lines with `id` or `class` attributes |
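
Because most of these constants are raw string fragments rather than compiled expressions, they are presumably meant to be concatenated before being passed to `RegExp`. The sketch below illustrates that idea with local stand-ins. The value given to `bibleRegex` here is an assumption inferred from its description, and `chapterVerse` is a hypothetical name; neither is taken from the package.

```javascript
// Local stand-ins: the exact values exported by the package are not shown in
// this README, so the patterns below are assumptions based on the descriptions.
const nlg = "\n\t\t\t";          // newline plus three tabs, as described
const bibleRegex = "(\\d{1,3})"; // ASSUMED: a capture group for 1 to 3 digits

// Fragments are plain strings, so they can be concatenated before compilation.
const chapterVerse = new RegExp(bibleRegex + ":" + bibleRegex);

const match = "See John 3:16 for details".match(chapterVerse);
console.log(match[1], match[2]); // 3 16

// nlg can be joined into output to start a new line indented three tabs deep.
const indented = "<ul>" + nlg + "<li>item</li>";
console.log(JSON.stringify(indented)); // "<ul>\n\t\t\t<li>item</li>"
```

Keeping fragments as strings means the same piece can be reused inside several larger patterns, at the cost of having to double-escape backslashes in the source.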