Probabilistic context-free grammar

In theoretical linguistics and computational linguistics, probabilistic context-free grammars (PCFGs) extend context-free grammars, similar to how hidden Markov models extend regular grammars. Each production is assigned a probability. The probability of a derivation (parse) is the product of the probabilities of the productions used in that derivation. These probabilities can be viewed as parameters of the model, and for large problems it is convenient to learn these parameters via machine learning. A probabilistic grammar's validity is constrained by the context of its training dataset. PCFGs originated from grammar theory and have applications in areas as diverse as natural language processing, the study of the structure of RNA molecules, and the design of programming languages. Designing efficient PCFGs requires weighing scalability against generality. Issues such as grammar ambiguity must be resolved. The grammar design affects the accuracy of the results. Grammar parsing algorithms have various time and memory requirements. == Definitions == Derivation: The process of recursive generation of strings from a grammar. Parsing: Finding a valid derivation using an automaton. Parse tree: The alignment of the grammar to a sequence. An example of a parser for PCFGs is the pushdown automaton. The algorithm parses grammar nonterminals from left to right in a stack-like manner. This brute-force approach is not very efficient. In RNA secondary structure prediction, variants of the Cocke–Younger–Kasami (CYK) algorithm provide more efficient alternatives for grammar parsing than pushdown automata. Another example of a PCFG parser is the Stanford Statistical Parser, which has been trained using Treebank.
== Formal definition == Similar to a CFG, a probabilistic context-free grammar G can be defined by a quintuple $G = (M, T, R, S, P)$, where M is the set of non-terminal symbols, T is the set of terminal symbols, R is the set of production rules, S is the start symbol, and P is the set of probabilities on production rules. == Relation with hidden Markov models == PCFG models extend context-free grammars in the same way that hidden Markov models extend regular grammars. The Inside-Outside algorithm is an analogue of the Forward-Backward algorithm. It computes the total probability of all derivations that are consistent with a given sequence, based on some PCFG. This is equivalent to the probability of the PCFG generating the sequence, and is intuitively a measure of how consistent the sequence is with the given grammar. The Inside-Outside algorithm is used in model parametrization to estimate prior frequencies observed from training sequences in the case of RNAs. Dynamic programming variants of the CYK algorithm find the Viterbi parse of an RNA sequence for a PCFG model. This parse is the most likely derivation of the sequence by the given PCFG. == Grammar construction == Context-free grammars are represented as a set of rules inspired by attempts to model natural languages. The rules are absolute and have a typical syntax representation known as Backus–Naur form. The production rules consist of terminal symbols $\{a, b\}$ and non-terminal symbols such as S; a blank $\epsilon$ may also be used as an end point. In the production rules of CFGs and PCFGs, the left side has only one nonterminal, whereas the right side can be any string of terminals or nonterminals. In PCFGs, nulls are excluded.
An example of a grammar: $S \to aS,\ S \to bS,\ S \to \epsilon$. This grammar can be shortened using the '|' ('or') character into $S \to aS \mid bS \mid \epsilon$. Terminals in a grammar are words, and through the grammar rules a non-terminal symbol is transformed into a string of terminals and/or non-terminals. The above grammar is read as "beginning from a non-terminal S the emission can generate either a or b or $\epsilon$". An example derivation is $S \Rightarrow aS \Rightarrow abS \Rightarrow abbS \Rightarrow abb$. An ambiguous grammar may result in ambiguous parsing if applied to homographs, since the same word sequence can have more than one interpretation. Pun sentences such as the newspaper headline "Iraqi Head Seeks Arms" are an example of ambiguous parses. One strategy for dealing with ambiguous parses (originating with grammarians as early as Pāṇini) is to add yet more rules, or to prioritize them so that one rule takes precedence over others. This, however, has the drawback of proliferating the rules, often to the point where they become difficult to manage. Another difficulty is overgeneration, where unlicensed structures are also generated. Probabilistic grammars circumvent these problems by ranking various productions on frequency weights, resulting in a "most likely" (winner-take-all) interpretation. As usage patterns are altered in diachronic shifts, these probabilistic rules can be re-learned, thus updating the grammar. Assigning probabilities to production rules makes a PCFG. These probabilities are informed by observing distributions on a training set of similar composition to the language to be modeled. On most samples of broad language, probabilistic grammars whose probabilities are estimated from data typically outperform hand-crafted grammars.
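The product rule for derivation probabilities can be illustrated on the example grammar above. The following is a minimal sketch, not a reference implementation: the rule probabilities (0.4, 0.3, 0.3) and the names `RULES`, `check_normalized` and `derivation_probability` are assumptions made for illustration only. Because this particular grammar is unambiguous, each string over a and b has exactly one derivation.

```python
from math import isclose, prod

# Hypothetical probabilities for S -> aS | bS | epsilon (illustrative values,
# not taken from the text). The empty string "" stands for epsilon.
RULES = {
    ("S", "aS"): 0.4,
    ("S", "bS"): 0.3,
    ("S", ""): 0.3,
}


def check_normalized(rules):
    """Return True if the probabilities of each left-hand side sum to 1."""
    totals = {}
    for (lhs, _rhs), p in rules.items():
        totals[lhs] = totals.get(lhs, 0.0) + p
    return all(isclose(total, 1.0) for total in totals.values())


def derivation_probability(string, rules=RULES):
    """Probability of the single derivation of `string` (over 'a' and 'b').

    One rule S -> xS is used per emitted symbol x, followed by S -> epsilon.
    The derivation probability is the product of the rule probabilities.
    """
    used = [("S", ch + "S") for ch in string] + [("S", "")]
    return prod(rules[rule] for rule in used)


# The derivation S => aS => abS => abbS => abb uses four rules.
p_abb = derivation_probability("abb")  # 0.4 * 0.3 * 0.3 * 0.3
```

Under these assumed weights, re-estimating the three numbers from a corpus is all that "re-learning" the grammar would involve; the rule set itself stays fixed.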
CFGs, in contrast to PCFGs, are not applicable to RNA structure prediction because, while they incorporate the sequence-structure relationship, they lack the scoring metrics that reveal a sequence's structural potential. == Weighted context-free grammar == A weighted context-free grammar (WCFG) is a more general category of context-free grammar, where each production has a numeric weight associated with it. The weight of a specific parse tree in a WCFG is the product (or sum) of all rule weights in the tree. Each rule weight is included as often as the rule is used in the tree. A special case of WCFGs are PCFGs, where the weights are (logarithms of) probabilities. An extended version of the CYK algorithm can be used to find the "lightest" (least-weight) derivation of a string given some WCFG. When the tree weight is the product of the rule weights, WCFGs and PCFGs can express the same set of probability distributions. == Applications == === RNA structure prediction === Since the 1990s, PCFGs have been applied to model RNA structures. Energy minimization and PCFGs provide ways of predicting RNA secondary structure with comparable performance. However, structure prediction by PCFGs is scored probabilistically rather than by minimum free energy calculation. PCFG model parameters are derived directly from frequencies of different features observed in databases of RNA structures, rather than by experimental determination as is the case with energy minimization methods. The types of structure that can be modeled by a PCFG include long-range interactions, pairwise structure and other nested structures. However, pseudoknots cannot be modeled. PCFGs extend CFGs by assigning probabilities to each production rule. A maximum probability parse tree from the grammar implies a maximum probability structure.
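A maximum probability parse can be found with a CYK-style dynamic program. The sketch below assumes a grammar in Chomsky normal form; the toy grammar, its probabilities, and the names `BINARY`, `LEXICAL` and `cyk` are hypothetical and chosen only so that some strings have more than one parse. It returns both the probability of the best (Viterbi) parse and the total (inside) probability summed over all parses.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (made-up probabilities).
# BINARY maps (A, B, C) -> P(A -> B C); LEXICAL maps (A, terminal) -> P(A -> terminal).
BINARY = {
    ("S", "A", "B"): 0.6,
    ("S", "B", "A"): 0.4,
    ("A", "A", "A"): 0.2,
    ("B", "B", "B"): 0.1,
}
LEXICAL = {("A", "a"): 0.8, ("B", "b"): 0.9}


def cyk(tokens, start="S"):
    """Return (best parse probability, total inside probability) for tokens."""
    n = len(tokens)
    best = defaultdict(float)    # best[(i, j, A)]: max probability of A => tokens[i:j]
    inside = defaultdict(float)  # inside[(i, j, A)]: sum over all parses of tokens[i:j]

    # Spans of width 1 are filled from the lexical rules.
    for i, tok in enumerate(tokens):
        for (lhs, term), p in LEXICAL.items():
            if term == tok:
                best[(i, i + 1, lhs)] = p
                inside[(i, i + 1, lhs)] = p

    # Wider spans combine two adjacent, strictly narrower spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (lhs, left, right), p in BINARY.items():
                    candidate = p * best[(i, k, left)] * best[(k, j, right)]
                    best[(i, j, lhs)] = max(best[(i, j, lhs)], candidate)
                    inside[(i, j, lhs)] += (
                        p * inside[(i, k, left)] * inside[(k, j, right)]
                    )
    return best[(0, n, start)], inside[(0, n, start)]


# "aaab" has two parses (the three a's can bracket two ways), so the inside
# probability is twice the Viterbi probability under this toy grammar.
viterbi_p, inside_p = cyk(list("aaab"))
```

Replacing `max` with a sum is the only difference between the two tables, which mirrors the relationship between the Viterbi parse and the total probability of a sequence. Practical RNA grammars are usually not in Chomsky normal form, so real implementations adapt the recursion to the grammar at hand.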
Since RNAs preserve their structures over their primary sequence, RNA structure prediction can be guided by combining evolutionary information from comparative sequence analysis with biophysical knowledge about the plausibility of a structure based on such probabilities. Search results for structural homologs using PCFG rules are also scored according to the probabilities of PCFG derivations. Therefore, building a grammar to model the behavior of base pairs and single-stranded regions starts with exploring features of a structural multiple sequence alignment of related RNAs. $S \to aSa \mid bSb \mid aa \mid bb$ The above grammar generates a string in an outside-in fashion, that is, the base pair at the outermost ends of the string is derived first. So a string such as $aabaabaa$ is derived by first generating the distal a's on both sides before moving inwards: $S \Rightarrow aSa \Rightarrow aaSaa \Rightarrow aabSbaa \Rightarrow aabaabaa$ The extensibility of a PCFG model allows structure prediction to be constrained by incorporating expectations about different features of an RNA. Such an expectation may reflect, for example, the propensity of an RNA to assume a certain structure. However, incorporating too much information may increase the space and memory complexity of the PCFG, and it is desirable that a PCFG-based model be as simple as possible. Every possible string x that a grammar generates is assigned a probability weight $P(x|\theta)$ given the PCFG model $\theta$. It follows that the sum of the probabilities of all strings the grammar can generate is $\sum_{x} P(x|\theta) = 1$. The scores

Contract management software

Contract management software constitutes software and associated data management used to support contract management, contract lifecycle management, and contractor management on projects in the procurement of goods and services. It may be used together with project management software. == History == Historically, contract management was seen as a "paper-intensive" process. Early steps toward automation, reported by the Aberdeen Group in the early 2000s, required extensive data conversion work to enable documents to be handled electronically. With the adoption of the European Union's General Data Protection Regulation (GDPR) in 2016, companies needed to take additional steps with regard to contract management. Each entity responsible for data was obliged to sign data processing agreements (DPAs) with the various vendors who process personal data on its behalf. DPAs need to be regularly reviewed, adjusted and renewed, which adds an extra agreement for each such vendor, or at least an extra DPA addendum to each agreement. By 2018, research by Ardent Partners had found that software for automating contract management activities was being used more extensively among major companies or businesses with "Best-in-Class" procurement teams. Contract management process automation was found to be closely linked with more effective internal business collaboration, standardization and risk management. == Advantages and key functions == Using contract management software can have multiple benefits compared to manually managing paper contracts. This software can help keep track of multiple activities and can have features for automating administration, ensuring compliance, monitoring risk, running reports and triggering alerts. In addition to these types of features, contract management software systems provide a centralized repository for employees to quickly access all contracts worldwide in one place.
Contract management software is produced by many companies, working on a range of scales and offering varying degrees of customizability. Basic functions should include the ability to store contract documents, track changes to contract documents, search documents for a particular criterion, send key date alerts and to report required aspects of the contract. Other functions include managing a new contract request, capturing related data, following a document through a review and approval process, and collecting digital signatures. Contract management software may also be an aid to project portfolio management and spend analysis, and may also monitor KPIs. Leading contract management software provides contract visibility, monitoring, and compliance to automate and streamline the contract lifecycle process. Contract management software which uses artificial intelligence (AI) can identify contract types based on pattern recognition. AI contracting software trains its algorithms on a set of contract data to recognize patterns and extract variables such as clauses, dates, and parties. It also offers simple prediction capabilities, by sorting through a large volume of contracts and flagging individual contracts based on specified criteria. AI software can also read contracts in multiple formats and languages, extract contract data, and provide analytics. It can reduce the risk of human error in contract drafting and review. A centralized repository provides a critical advantage allowing for all contract documents to be stored within one location. Having contracts stored in multiple locations can delay and interrupt the contracting process. == Contract risk management software (CRMS) for capital projects == Very large enterprises, such as capital expenditure (capex) projects, involve multiple parties and high risk and uncertainty. They are unlike traditional operating contracts in that they are subject to shared deadlines in unique situations. 
As the complexity of these unique projects increases, the relationships between parties become more important. This requires contract management software, or contract risk management software (CRMS), to become more dynamic and responsive. The terms of these capex contracts necessarily involve assumptions at the start of the process and are likely to change over the lifetime of the project lifecycle. For this reason, CRMS must be capable of recording one single instance of agreed changes to contract terms and incorporating these changes in an auditable and legally robust way. With multiple decision makers involved, CRMS should also make accountability more transparent and enable faster decisions about variation proposals.

Greedy embedding

In distributed computing and geometric graph theory, greedy embedding is a process of assigning coordinates to the nodes of a telecommunications network in order to allow greedy geographic routing to be used to route messages within the network. Although greedy embedding has been proposed for use in wireless sensor networks, in which the nodes already have positions in physical space, these existing positions may differ from the positions given to them by greedy embedding, which may in some cases be points in a virtual space of a higher dimension, or in a non-Euclidean geometry. In this sense, greedy embedding may be viewed as a form of graph drawing, in which an abstract graph (the communications network) is embedded into a geometric space. The idea of performing geographic routing using coordinates in a virtual space, instead of using physical coordinates, is due to Rao et al. Subsequent developments have shown that every network has a greedy embedding with succinct vertex coordinates in the hyperbolic plane, that certain graphs including the polyhedral graphs have greedy embeddings in the Euclidean plane, and that unit disk graphs have greedy embeddings in Euclidean spaces of moderate dimensions with low stretch factors. == Definitions == In greedy routing, a message from a source node s to a destination node t travels to its destination by a sequence of steps through intermediate nodes, each of which passes the message on to a neighboring node that is closer to t. If the message reaches an intermediate node x that does not have a neighbor closer to t, then it cannot make progress and the greedy routing process fails. A greedy embedding is an embedding of the given graph with the property that a failure of this type is impossible. Thus, it can be characterized as an embedding of the graph with the property that for every two nodes x and t, there exists a neighbor y of x such that d(x,t) > d(y,t), where d denotes the distance in the embedded space. 
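The characterization above translates directly into a check that can be run on a candidate embedding. The following sketch assumes Euclidean distance and a graph given as an adjacency dictionary; the names `is_greedy_embedding`, `greedy_route`, `STAR`, `IN_PLANE` and `IN_SPACE` are hypothetical, and the routing function forwards to the neighbor closest to the destination, which is one common choice among the neighbors that make progress. The test graph is a star with one internal node and six leaves.

```python
from math import dist


def is_greedy_embedding(adjacency, position):
    """Check that every node x has, for every other node t, a neighbor y
    with d(y, t) < d(x, t), using Euclidean distance between positions."""
    for x in adjacency:
        for t in adjacency:
            if x == t:
                continue
            d_xt = dist(position[x], position[t])
            if not any(dist(position[y], position[t]) < d_xt for y in adjacency[x]):
                return False
    return True


def greedy_route(adjacency, position, source, target):
    """Forward to the neighbor closest to the target; return the path taken,
    or None if some node has no neighbor closer to the target than itself."""
    path = [source]
    while path[-1] != target:
        here = path[-1]
        nxt = min(adjacency[here], key=lambda y: dist(position[y], position[target]))
        if dist(position[nxt], position[target]) >= dist(position[here], position[target]):
            return None  # stuck: greedy routing fails at this node
        path.append(nxt)
    return path


# A star with center "c" and six leaves.
LEAVES = ["l1", "l2", "l3", "l4", "l5", "l6"]
STAR = {"c": list(LEAVES), **{leaf: ["c"] for leaf in LEAVES}}

# One particular planar placement (illustrative): l1 and l2 are closer to each
# other than the center is to l2, so routing from l1 to l2 gets stuck.
IN_PLANE = {"c": (0, 0), "l1": (1, 0), "l2": (1, 1), "l3": (0, 1),
            "l4": (-1, 0), "l5": (-1, -1), "l6": (0, -1)}

# Leaves at unit distance along each coordinate axis in three dimensions.
IN_SPACE = {"c": (0, 0, 0), "l1": (1, 0, 0), "l2": (-1, 0, 0), "l3": (0, 1, 0),
            "l4": (0, -1, 0), "l5": (0, 0, 1), "l6": (0, 0, -1)}
```

The check compares all ordered pairs of nodes and scans each neighbor list, so it is only intended for small examples. Note that it tests one specific placement; a negative result for `IN_PLANE` says nothing by itself about other planar placements.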
== Graphs with no greedy embedding == Not every graph has a greedy embedding into the Euclidean plane; a simple counterexample is given by the star K1,6, a tree with one internal node and six leaves. Whenever this graph is embedded into the plane, some two of its leaves must form an angle of 60 degrees or less, from which it follows that at least one of these two leaves does not have a neighbor that is closer to the other leaf. In Euclidean spaces of higher dimensions, more graphs may have greedy embeddings; for instance, K1,6 has a greedy embedding into three-dimensional Euclidean space, in which the internal node of the star is at the origin and the leaves are a unit distance away along each coordinate axis. However, for every Euclidean space of fixed dimension, there are graphs that cannot be embedded greedily: whenever the number n is greater than the kissing number of the space, the graph K1,n has no greedy embedding. == Hyperbolic and succinct embeddings == Unlike the case for the Euclidean plane, every network has a greedy embedding into the hyperbolic plane. The original proof of this result, by Robert Kleinberg, required the node positions to be specified with high precision, but subsequently it was shown that, by using a heavy path decomposition of a spanning tree of the network, it is possible to represent each node succinctly, using only a logarithmic number of bits per point. In contrast, there exist graphs that have greedy embeddings in the Euclidean plane, but for which any such embedding requires a polynomial number of bits for the Cartesian coordinates of each point. == Special classes of graphs == === Trees === The class of trees that admit greedy embeddings into the Euclidean plane has been completely characterized, and a greedy embedding of a tree can be found in linear time when it exists. 
For more general graphs, some greedy embedding algorithms such as the one by Kleinberg start by finding a spanning tree of the given graph, and then construct a greedy embedding of the spanning tree. The result is necessarily also a greedy embedding of the whole graph. However, there exist graphs that have a greedy embedding in the Euclidean plane but for which no spanning tree has a greedy embedding. === Planar graphs === Papadimitriou & Ratajczak (2005) conjectured that every polyhedral graph (a 3-vertex-connected planar graph, or equivalently by Steinitz's theorem the graph of a convex polyhedron) has a greedy embedding into the Euclidean plane. By exploiting the properties of cactus graphs, Leighton & Moitra (2010) proved the conjecture; the greedy embeddings of these graphs can be defined succinctly, with logarithmically many bits per coordinate. However, the greedy embeddings constructed according to this proof are not necessarily planar embeddings, as they may include crossings between pairs of edges. For maximal planar graphs, in which every face is a triangle, a greedy planar embedding can be found by applying the Knaster–Kuratowski–Mazurkiewicz lemma to a weighted version of a straight-line embedding algorithm of Schnyder. The strong Papadimitriou–Ratajczak conjecture, that every polyhedral graph has a planar greedy embedding in which all faces are convex, remains unproven. === Unit disk graphs === The wireless sensor networks that are the target of greedy embedding algorithms are frequently modeled as unit disk graphs, graphs in which each node is represented as a unit disk and each edge corresponds to a pair of disks with nonempty intersection. For this special class of graphs, it is possible to find succinct greedy embeddings into a Euclidean space of polylogarithmic dimension, with the additional property that distances in the graph are accurately approximated by distances in the embedding, so that the paths followed by greedy routing are short.

Vinyl cutter

A vinyl cutter is an entry-level machine for making signs. Computer-designed vector files with patterns and letters are sent to the cutter through a USB or serial cable and cut directly on the roll of vinyl that is mounted and fed into the machine. Vinyl cutters are mainly used to make signs, banners and advertisements. Advertisements seen on automobiles and vans are often made with vinyl cut letters. While these machines were designed for cutting vinyl, they can also cut through computer and specialty papers, as well as thicker items like thin sheets of magnet. In addition to the sign business, vinyl cutters are commonly used for apparel decoration. To decorate apparel, a vector design needs to be cut in mirror image, weeded, and then heat applied using a commercial heat press or a hand iron for home use. Some businesses use their vinyl cutter to produce both signs and custom apparel. Many crafters also have vinyl cutters for home use. These require little maintenance, and the vinyl can be bought in bulk relatively cheaply. Vinyl cutters are also often used by stencil artists to create single-use or reusable stencil art and lettering. == How it works == A vinyl cutter is a type of computer-controlled machine tool. The computer controls the movement of a sharp blade over the surface of the material as it would the nozzles of an ink-jet printer. This blade is used to cut out shapes and letters from sheets of thin self-adhesive plastic (vinyl). The vinyl can then be stuck to a variety of surfaces depending on the adhesive and type of material. To cut out a design, a vector-based image must be created using vector drawing software. Some vinyl cutters are marketed to small in-home businesses and require the download and use of proprietary editing software. The design is then sent to the cutter, which cuts along the vector paths laid out in the design. The cutter is capable of moving the blade on an X and Y axis over the material, cutting it into the required shapes.
The vinyl material comes in long rolls, allowing projects with significant length like banners or billboards to be easily cut. A major limitation of vinyl cutters is that they can only cut shapes from solid colours of vinyl, paper, card or thin plastic sheets such as Mylar. The type and thickness of material that can be cut vary between cutters and depend on how much downforce the cutter is capable of. If the material has no backing, a backing sheet or cutting mat and a temporary adhesive are needed to allow the cutter to cut through the material. A design with multiple colours must have each colour cut separately, with the colours then layered on top of each other as the design is applied to the substrate. This is a process that is often applied in stencil art. Also, since the shapes are cut out of solid colours, photographs and gradients cannot be reproduced with a stand-alone cutter. === Design creation === Designs are created using vector-based software like Adobe Illustrator, FlexiSign, EasyCutPro, or other software. Vector artwork is either drawn with lines, shapes and text, or produced by vectorizing images to create vector shapes. Most cutters (also called plotters) require special software to load and edit the artwork and communicate with the cutter. Computer-designed images are loaded onto the vinyl cutter via a wired connection or over a wireless protocol. Then the vinyl is loaded into the machine, where it is automatically fed through and cut to follow the set design. The vinyl can be placed on an adhesive mat to stabilize it when cutting smaller designs. === Types of vinyl === Adhesive vinyl is the type of vinyl used for store windows, car decals, signage, and more. Adhesive vinyl is applied with a transfer medium often called "transfer tape" or "carrier sheet". Heat transfer vinyl is the type of vinyl used to apply a design to fabric including t-shirts, tea towels, canvas bags, and more.
Heat transfer vinyl can be applied using a heat press or an iron, though the constant pressure and heat from a heat press is recommended by experts. === Using other materials === In addition to vinyl, some cutters are capable of cutting other materials such as paper, card, plastic sheets and even thin wood. The thickness and type of material that can be cut depend on the model of the cutter and, in particular, on its downforce. Cricut is a popular home cutter used by arts and craft enthusiasts, since it handles a wide range of materials, is similar in size to a household printer, and has strong downforce for its size. === Backing and cutting mat === Material that does not have an adhesive backing must be attached to a cutting mat before it is cut. Some cutting mats are sticky; others require a temporary adhesive and/or masking tape to keep the material in place when cutting. === Cutting === The vinyl cutter uses a small knife or blade to precisely cut the outline of figures into a sheet or piece of vinyl, but not the release liner. The process of cutting vinyl material without penetrating it completely is referred to as "kiss cutting". The knife moves side to side and turns, while the vinyl is moved beneath the knife. The result of the cutting process is an image cut into the material. === Weeding === The material is then "weeded": the excess parts of the figures are removed from the release liner. It is possible to remove the positive parts, which gives a negative decal, or to remove the negative parts, giving a positive decal. Removing the figure would be like removing the positive, giving a negative image of the figures. === Transfer tape === A sheet of transfer tape with an adhesive backing is laid on the weeded vinyl when necessary. Heat transfer vinyl often does not require the use of a separate transfer tape. A roller is applied to the tape, causing it to adhere to the vinyl.
The transfer tape and the weeded vinyl are pulled off the release liner and applied to a substrate, such as a sheet of aluminium. This results in an aluminium sign with vinyl figures. == Uses == In addition to the capabilities of the cutter itself, adhesive vinyl comes in a wide variety of colors and materials including gold and silver foil, vinyl that simulates frosted glass, holographic vinyl, reflective vinyl, thermal transfer material, and even clear vinyl embedded with gold leaf (often used in the lettering on fire trucks and rescue vehicles). As supplied by the manufacturer, the vinyl film comes attached to a release liner. == Challenges when cutting on a vinyl cutter == Cutting on a vinyl cutter requires careful calibration to achieve clean and accurate results, especially when the goal is to cut through only the top layer of material while leaving the backing intact. One of the most common challenges is setting the correct cutting depth. If the blade is not lowered enough, the vinyl material may not separate properly; if it goes too deep, it can cut through the backing layer and potentially damage the cutting mat. The cutting depth on vinyl cutters typically does not exceed 1 mm. Another frequent issue is a mismatch between the blade and the type of material being processed. Using an inappropriate blade can lead to uneven cuts, premature dulling of the edge, and torn or frayed material. The overall quality of the output also depends on factors such as the cutting speed, the sharpness and cutting angle of the blade, and the material the knife is made of.

Media Block

A Media Block or Integrated Media Block (IMB) is a component in a digital cinema projection system. Its purpose is to convert the Digital Cinema Package (DCP) content into data that ultimately produces picture and sound in a theater in compliance with DCI anti-piracy encryption requirements. == Terminology == The DCI specification allows for two different security system architectures. In the first, the Media Block is outside of the projector. This design is simply referred to as a "Media Block" and is typically a device attached directly to the motherboard of a digital cinema server. The media block is usually connected to the projector by dual-link SDI cables. Such a media block is limited to processing 2K output, downscaling 4K DCPs if necessary. The second architecture describes an "Integrated Media Block". This refers to a device attached and integrated directly into the projector, which receives image data from the server, usually via a Cat 6 Ethernet connection. IMBs can process 2K and 4K output. Some hardware implementations integrate the entire server on a single board and are able to work both as a media block and as an IMB. == Security features == All security functions are contained within a Secure Processing Block (SPB), a tamper-proof physical device. Upon ingestion into a DCP server, Key Delivery Messages (KDMs) are stored on flash memory in the media block or IMB. A KDM is written to enable the playback of a specific DCP during a specific time window and on a specific media block or IMB, identified by its serial number during the authoring process. Media blocks and IMBs also contain a secure clock that is set in the factory and cannot be altered by the end user; the DCP servers to which they are attached use this clock to determine showtimes. The secure clock prevents theaters from showing encrypted movies outside the times authorized by the KDM (e.g. after it has expired) by simply changing the date and time in the server's BIOS.
Media blocks and IMBs also typically include anti-tamper devices, designed to self-destruct the unit if unauthorized modification of its hardware, software or secure clock is attempted.

Image texture

An image texture is the small-scale structure perceived on an image, based on the spatial arrangement of color or intensities. It can be quantified by a set of metrics calculated in image processing. Image texture metrics give information about the whole image or selected regions. Image textures can be artificially created or found in natural scenes captured in an image. Image textures are one tool that can help in the segmentation or classification of images. For more accurate segmentation, the most useful features are spatial frequency and average grey level. There are two ways to analyze an image texture in computer graphics: the structured approach and the statistical approach. == Structured approach == A structured approach sees an image texture as a set of primitive texels in some regular or repeated pattern. This works well when analyzing artificial textures. To obtain a structured description, a characterization of the spatial relationship of the texels is gathered by using a Voronoi tessellation of the texels. == Statistical approach == A statistical approach sees an image texture as a quantitative measure of the arrangement of intensities in a region. In general this approach is easier to compute and is more widely used, since natural textures are made of patterns of irregular subelements. === Edge detection === Edge detection can be used to determine the number of edge pixels in a specified region, which helps characterize texture complexity. After edges have been found, the direction of the edges can also be used as a characteristic of texture and can be useful in determining patterns in the texture. These directions can be represented as an average or in a histogram. Consider a region with N pixels. A gradient-based edge detector is applied to this region, producing two outputs for each pixel p: the gradient magnitude Mag(p) and the gradient direction Dir(p).
The edgeness per unit area can be defined by $$F_{edgeness} = \frac{|\{p \mid Mag(p) > T\}|}{N}$$ for some threshold T. To include orientation with edgeness, histograms of both gradient magnitude and gradient direction can be used. $H_{mag}(R)$ denotes the normalized histogram of gradient magnitudes of region R, and $H_{dir}(R)$ denotes the normalized histogram of gradient orientations of region R. Both are normalized according to the size $N_R$ of the region. Then $$F_{mag,dir} = (H_{mag}(R), H_{dir}(R))$$ is a quantitative texture description of region R. === Co-occurrence matrices === The co-occurrence matrix captures numerical features of a texture using spatial relations of similar gray tones. Numerical features computed from the co-occurrence matrix can be used to represent, compare, and classify textures. The following are a subset of standard features derivable from a normalized co-occurrence matrix: $$\begin{aligned}\text{Angular 2nd Moment} &= \sum_{i}\sum_{j} p[i,j]^{2}\\ \text{Contrast} &= \sum_{i=1}^{N_g}\sum_{j=1}^{N_g} n^{2}\, p[i,j], \text{ where } |i-j| = n\\ \text{Correlation} &= \frac{\sum_{i=1}^{N_g}\sum_{j=1}^{N_g} (ij)\, p[i,j] - \mu_{x}\mu_{y}}{\sigma_{x}\sigma_{y}}\\ \text{Entropy} &= -\sum_{i}\sum_{j} p[i,j] \ln(p[i,j])\end{aligned}$$ where $p[i,j]$ is the $[i,j]$th entry in a gray-tone spatial dependence matrix, and $N_g$ is the number of distinct gray levels in the quantized image. One negative aspect of the co-occurrence matrix is that the extracted features do not necessarily correspond to visual perception.
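The co-occurrence features above can be computed in a few lines. The following is a minimal pure-Python sketch under stated assumptions: the image is a list of rows of integer gray levels in 0..levels-1, pairs are counted in one direction only for a single offset (dx, dy), and the image has at least one valid pixel pair. The function names `cooccurrence` and `texture_features` are hypothetical, and correlation is omitted for brevity.

```python
from math import log


def cooccurrence(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix p[i][j]: the fraction of pixel pairs
    (pixel, pixel shifted by (dx, dy)) with gray levels i and j."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))  # assumed non-zero (at least one pair)
    return [[value / total for value in row] for row in counts]


def texture_features(p):
    """Angular second moment, contrast and entropy of a normalized matrix p."""
    idx = range(len(p))
    asm = sum(p[i][j] ** 2 for i in idx for j in idx)
    # Contrast weights each entry by n^2 with n = |i - j|.
    contrast = sum((i - j) ** 2 * p[i][j] for i in idx for j in idx)
    # Zero entries contribute nothing to the entropy and are skipped.
    entropy = -sum(p[i][j] * log(p[i][j]) for i in idx for j in idx if p[i][j] > 0)
    return {"asm": asm, "contrast": contrast, "entropy": entropy}


# Tiny two-level example: horizontal neighbors are (0,0) twice and (0,1) twice.
example = [[0, 0, 1],
           [0, 0, 1]]
p = cooccurrence(example, levels=2)
features = texture_features(p)
```

Indexing gray levels from 0 rather than 1 does not change the contrast, which depends only on the difference |i - j|. Many implementations also symmetrize the matrix or average over several offsets; this sketch does neither.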
The co-occurrence matrix is used in dentistry for the objective evaluation of lesions [DOI: 10.1155/2020/8831161], treatment efficacy [DOI: 10.3390/ma13163614; DOI: 10.11607/jomi.5686; DOI: 10.3390/ma13173854; DOI: 10.3390/ma13132935] and bone reconstruction during healing [DOI: 10.5114/aoms.2013.33557; DOI: 10.1259/dmfr/22185098; EID: 2-s2.0-81455161223; DOI: 10.3390/ma13163649].

=== Laws texture energy measures ===
Another approach is to use local masks to detect various types of texture features. Laws originally used four vectors representing texture features to create sixteen 2D masks from the outer products of the pairs of vectors. The four vectors and relevant features were as follows:

L5 = [ +1 +4 6 +4 +1 ] (Level)
E5 = [ -1 -2 0 +2 +1 ] (Edge)
S5 = [ -1 0 2 0 -1 ] (Spot)
R5 = [ +1 -4 6 -4 +1 ] (Ripple)

To these four, a fifth is sometimes added:

W5 = [ -1 +2 0 -2 +1 ] (Wave)

From Laws' four vectors, the sixteen 5×5 masks are then reduced to nine by combining certain symmetric pairs. For instance, L5E5 measures vertical edge content and E5L5 measures horizontal edge content. The average of these two measures is the "edginess" of the content. The resulting nine maps used by Laws are as follows: L5E5/E5L5, L5R5/R5L5, E5S5/S5E5, S5S5, R5R5, L5S5/S5L5, E5E5, E5R5/R5E5, S5R5/R5S5.

Running each of these nine masks over an image, writing each result at the position of the mask origin ([2,2]), produces nine "energy maps", or conceptually an image in which each pixel is associated with a vector of nine texture attributes.

=== Autocorrelation and power spectrum ===
The autocorrelation function of an image can be used to detect repetitive patterns of textures.

== Texture segmentation ==
Image texture can be used to describe regions and so to divide an image into segments. There are two main types of segmentation based on image texture: region based and boundary based. Although image texture is not a perfect measure for segmentation, it is used along with other measures, such as color, to help segment an image.
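As one illustration of the per-pixel texture attributes that such a segmentation could draw on, the Laws masks described earlier can be sketched in pure Python. This is a simplified sketch under stated assumptions: the helper names are hypothetical, image borders are zero-padded, and the absolute filter response at each pixel stands in for the energy value, without any further local averaging.

```python
from itertools import product

# The four Laws vectors as listed in the text.
VECTORS = {
    "L5": [1, 4, 6, 4, 1],     # Level
    "E5": [-1, -2, 0, 2, 1],   # Edge
    "S5": [-1, 0, 2, 0, -1],   # Spot
    "R5": [1, -4, 6, -4, 1],   # Ripple
}

def outer(u, v):
    """Outer product of two vectors as a list of rows."""
    return [[a * b for b in v] for a in u]

# Sixteen 5x5 masks, one per ordered pair of vectors.
MASKS = {a + b: outer(VECTORS[a], VECTORS[b])
         for a, b in product(VECTORS, repeat=2)}

def filter_abs(image, mask):
    """Absolute response of a 5x5 mask centred on each pixel.

    The mask origin is [2][2]; pixels outside the image count as zero
    (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0
            for my in range(5):
                for mx in range(5):
                    yy, xx = y + my - 2, x + mx - 2
                    if 0 <= yy < rows and 0 <= xx < cols:
                        acc += mask[my][mx] * image[yy][xx]
            out[y][x] = abs(acc)
    return out

def energy_maps(image):
    """Nine maps: symmetric pairs are averaged and L5L5 is left out,
    matching the list of nine given in the text."""
    names = list(VECTORS)
    maps = {}
    for i, a in enumerate(names):
        for b in names[i:]:
            if a == b == "L5":
                continue
            first = filter_abs(image, MASKS[a + b])
            if a == b:
                maps[a + b] = first
            else:
                second = filter_abs(image, MASKS[b + a])
                maps[a + b + "/" + b + a] = [
                    [(p + q) / 2 for p, q in zip(r1, r2)]
                    for r1, r2 in zip(first, second)
                ]
    return maps
```

Calling `energy_maps(image)` on a list of rows of gray levels returns a dictionary keyed by names such as `"L5E5/E5L5"`; reading the nine values at one pixel position gives that pixel's texture vector, which a region-based or boundary-based method could then compare between neighbouring pixels.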
=== Region based ===
Region-based segmentation attempts to group or cluster pixels based on texture properties.

=== Boundary based ===
Boundary-based segmentation attempts to group or cluster pixels based on edges between pixels that come from different texture properties.

UCSD Pascal

UCSD Pascal is a Pascal programming language system that runs on the UCSD p-System, a portable, highly machine-independent operating system. UCSD Pascal was first released in 1977. It was developed at the University of California, San Diego (UCSD).

== The p-System ==
In 1977, the University of California, San Diego (UCSD) Institute for Information Systems developed UCSD Pascal to provide students with a common environment that could run on any of the then-available microcomputers as well as campus DEC PDP-11 minicomputers. The operating system became known as UCSD p-System.

There were three operating systems that IBM offered for its original IBM PC: the UCSD p-System, CP/M-86, and IBM PC DOS. Vendor SofTech Microsystems emphasized p-System's application portability, with virtual machines for 20 CPUs as of the IBM PC's release. It predicted that users would be able to use applications they purchased on future computers running p-System; advertisements called it "the Universal Operating System".

PC Magazine denounced UCSD p-System on the IBM PC, stating in a review of Context MBA, written in the language, that it "simply does not produce good code". The p-System did not sell very well for the IBM PC, because of a lack of applications and because it was more expensive than the other choices. Previously, IBM had offered the UCSD p-System as an option for IBM Displaywriter, an 8086-based dedicated word processing machine. (The Displaywriter's native operating system had been developed completely internally and was not opened for end-user programming.)

Notable extensions to standard Pascal include separately compilable Units and a String type. Some intrinsics were provided to accelerate string processing (e.g. scanning in an array for a particular search pattern); other language extensions were provided to allow the UCSD p-System to be self-compiling and self-hosted.

UCSD Pascal was based on a p-code machine architecture.
Its contribution to these early virtual machines was to extend p-code away from its roots as a compiler intermediate language into a full execution environment. The UCSD Pascal p-Machine was optimized for the new small microcomputers, whose addressing was restricted to 16 bits (only 64 KB of memory). James Gosling cites UCSD Pascal as a key influence (along with the Smalltalk virtual machine) on the design of the Java virtual machine.

UCSD p-System achieved machine independence by defining a virtual machine, called the p-Machine (or pseudo-machine, which many users began to call the "Pascal-machine", like the OS, although UCSD documentation always used "pseudo-machine"), with its own instruction set called p-code (or pseudo-code). Urs Ammann, a student of Niklaus Wirth, originally presented a p-code in his PhD thesis, and the UCSD implementation was derived from that work, the Zurich Pascal-P implementation. The UCSD implementation changed the Zurich implementation to be "byte oriented". The UCSD p-code was optimized for execution of the Pascal programming language. Each hardware platform then needed only a p-code interpreter written for it in order to port the entire p-System and all of its tools. Later versions also included additional languages that compiled to the p-code base. For example, Apple Computer offered a Fortran compiler (written by Silicon Valley Software, Sunnyvale, California) producing p-code that ran on the Apple version of the p-System. Later, TeleSoft (also located in San Diego) offered an early Ada development environment that used p-code and was therefore able to run on a number of hardware platforms, including the Motorola 68000, the System/370, and the Pascal MicroEngine.

UCSD p-System shares some concepts with the later Java platform. Both use a virtual machine to hide operating system and hardware differences, and both use programs written to that virtual machine to provide cross-platform support.
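The idea that only the interpreter has to be rewritten for each platform can be illustrated with a toy stack machine. The sketch below is not the UCSD p-code instruction set: the opcode names (`PUSH`, `ADD`, `MUL`, `PRINT`, `HALT`) and the tuple encoding are hypothetical and were chosen only to show how a compiled program becomes plain data that any host with an interpreter can execute.

```python
def run(program):
    """Interpret a list of (opcode, operand) pairs on a simple stack machine.

    The opcodes are hypothetical and only loosely inspired by p-code.
    Returns the list of printed values so the result is easy to inspect.
    """
    stack, output, pc = [], [], 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":        # push a constant onto the stack
            stack.append(arg)
        elif op == "ADD":       # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":       # pop two values, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "PRINT":     # pop the top value and record it
            output.append(stack.pop())
        elif op == "HALT":      # stop execution
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return output

# "Compiled" form of print((2 + 3) * 4): the same data would run
# unchanged on any host that provides its own run().
PROGRAM = [
    ("PUSH", 2), ("PUSH", 3), ("ADD", None),
    ("PUSH", 4), ("MUL", None),
    ("PRINT", None), ("HALT", None),
]
```

Here `run(PROGRAM)` returns `[20]`. Porting this toy system to a new machine would mean rewriting only `run`, while `PROGRAM` and any tools expressed in the same form stay untouched, which is the property the p-System relied on.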
Likewise, both systems allow the virtual machine to be used either as the complete operating system of the target computer or to run in a "box" under another operating system.

The UCSD Pascal compiler was distributed as part of a portable operating system, the p-System.

== History ==
UCSD p-System began around 1974 as the idea of UCSD's Kenneth Bowles, who believed that the number of new computing platforms coming out at the time would make it difficult for new programming languages to gain acceptance. He based UCSD Pascal on the Pascal-P2 release of the portable compiler from Zurich. He was particularly interested in Pascal as a language to teach programming. UCSD introduced two features that were important improvements on the original Pascal: variable-length strings, and "units" of independently compiled code (an idea incorporated into the then-evolving Ada programming language). Niklaus Wirth credits the p-System, and UCSD Pascal in particular, with popularizing Pascal. It was not until the release of Turbo Pascal that UCSD's version started to slip from first place among Pascal users.

The Pascal dialect of UCSD Pascal came from the subset of Pascal implemented in Pascal-P2, which was not designed to be a full implementation of the language, but rather "the minimum subset that would self-compile", to fit its function as a bootstrap kit for Pascal compilers. UCSD added strings from BASIC, and several other implementation-dependent features. Although UCSD Pascal later obtained many of the other features of the full Pascal language, the Pascal-P2 subset persisted in other dialects, notably Borland Pascal, which copied much of the UCSD dialect.

== Versions ==
There were four versions of the UCSD p-code engine, each with several revisions of the p-System and UCSD Pascal. A revision of the p-code engine (i.e., the p-Machine) meant a change to the p-code language, and therefore compiled code is not portable between different p-Machine versions.
Each revision was represented with a leading Roman numeral, while operating system revisions were enumerated as the "dot" number following the p-code Roman numeral. For example, II.3 represented the third revision of the p-System running on the second revision of the p-Machine.

=== Version I ===
Original version, never officially distributed outside of the University of California, San Diego. However, the Pascal sources for both Versions I.3 and I.5 were freely exchanged between interested users. Specifically, the patch revision I.5a was known to be one of the most stable.

=== Version II ===
Widely distributed, available on many early microcomputers. Numerous versions included the Apple II (ultimately Apple Pascal), DEC PDP-11, Intel 8080, Zilog Z80, and MOS 6502 based machines, the Motorola 68000, and the IBM PC (Version II on the PC was restricted to one 64K code segment and one 64K stack/heap data segment; Version IV removed the code segment limit but cost a lot more).

Project members from this era include Dr Kenneth L Bowles, Mark Allen, Richard Gleaves, Richard Kaufmann, Pete Lawrence, Joel McCormack, Mark Overgaard, Keith Shillington, Roger Sumner, and John Van Zandt.

=== Version III ===
Custom version written for Western Digital to run on their Pascal MicroEngine microcomputer. Included support for parallel processes for the first time.

=== Version IV ===
Commercial version, developed and sold by SofTech. Based on Version II; did not include changes from Version III. Did not sell well due to a combination of its pricing structure, performance problems due to the p-code interpreter, and competition with native operating systems (on top of which it often ran). After SofTech dropped the product, it was picked up by Pecan Systems, a relatively small company formed of p-System users and fans. Sales revived somewhat, due mostly to Pecan's reasonable pricing structure, but the p-System and UCSD Pascal gradually lost the market to native operating systems and compilers.
Available for the TI-99/4A equipped with a p-code card, Commodore CBM 8096, Sage II/IV, HP 9000, and BBC Micro with 6502 second processor.

== Further use ==
The Corvus Systems computer used UCSD Pascal for all its user software. The "innovative concept" of the Constellation OS was to run Pascal (interpretively or compiled) and include all common software in the manual, so users could modify as needed.