publications
2023
- PLDI’ 23Covering All the Bases: Type-based Verification of Test Input GeneratorsZhe Zhou, Ashish Mishra, Benjamin Delaware, and 1 more authorIn Accepted PLDI ’23, 2023
Test input generators are an important part of property-based testing (PBT) frameworks. Because PBT is intended to test deep semantic and structural properties of a program, the outputs produced by these generators can be complex data structures, constrained to satisfy properties the developer believes is most relevant to testing the function of interest. An important feature expected of these generators is that they be capable of producing all acceptable elements that satisfy the function’s input type and generator-provided constraints. However, it is not readily apparent how we might validate whether a particular generator’s output satisfies this coverage requirement. Typically, developers must rely on manual inspection and post-mortem analysis of test runs to determine if the generator is providing sufficient coverage; these approaches are error-prone and difficult to scale as generators become more complex. To address this important concern, we present a new refinement type-based verification procedure for validating the coverage provided by input test generators, based on a novel interpretation of types that embeds “must-style” underapproximate reasoning principles as a fundamental part of the type system. The types associated with expressions now capture the set of values guaranteed to be produced by the expression, rather than the typical formulation that uses types to represent the set of values an expression may produce. Beyond formalizing the notion of coverage types in the context of a rich core language with higher-order procedures and inductive datatypes, we also present a detailed evaluation study to justify the utility of our ideas.
- ECOOP’ 23Morpheus: Automated Safety Verification of Data-dependent Parser Combinator ProgramAshish Mishra, and Suresh JagannathanIn Accepted, ECOOP’ 23, 2023
Parser combinators are a well-known mechanism used for the compositional construction of parsers, and have shown to be particularly useful in writing parsers for rich grammars with data-dependencies and global state. Verifying applications written using them, however, has proven to be challenging in large part because of the inherently effectful nature of the parsers being composed and the difficulty in reasoning about the arbitrarily rich data-dependent semantic actions that can be associated with parsing actions. In this paper, we address these challenges by defining a parser combinator framework called Morpheus equipped with abstractions for defining composable effects tailored for parsing and semantic actions, and a rich specification language used to define safety properties over the constituent parsers comprising a program. Even though its abstractions yield many of the same expressivity benefits as other parser combinator systems, Morpheus is carefully engineered to yield a substantially more tractable automated verification pathway. We demonstrate its utility in verifying a number of realistic, challenging parsing applications, including several cases that involve non-trivial data-dependent relations.
2022
- OOPSLA’ 22Specification-Guided Component-Based Synthesis from Effectful LibrariesA Mishra, and Suresh JagannathanProc. ACM Program. Lang., Oct 2022
Component-based synthesis seeks to build programs using the APIs provided by a set of libraries. Oftentimes, these APIs have effects, which make it challenging to reason about the correctness of potential synthesis candidates. This is because changes to global state made by effectful library procedures affect how they may be composed together, yielding an intractably large search space that can confound typical enumerative synthesis techniques. If the nature of these effects are exposed as part of their specification, however, deductive synthesis approaches can be used to help guide the search for components. In this paper, we present a new specification-guided synthesis procedure that uses Hoare-style pre- and post-conditions to express fine-grained effects of potential library component candidates to drive a bi-directional synthesis search strategy. The procedure alternates between a forward search process that seeks to build larger terms given an existing context but which is otherwise unaware of the actual goal, alongside a backward search mechanism that seeks terms consistent with the desired goal but which is otherwise unaware of the context from which these terms must be synthesized. To further improve efficiency and scalability, we integrate a conflict-driven learning procedure into the synthesis algorithm that provides a semantic characterization of previously encountered unsuccessful search paths that is used to prune the space of possible candidates as synthesis proceeds. We have implemented our ideas in a tool called and demonstrate its effectiveness on a number of challenging synthesis problems defined over OCaml libraries equipped with effectful specifications.
- EMSE, ICSE ’23Stubbifier: Debloating Dynamic Server-Side JavaScript ApplicationsAlexi Turcotte, Ellen Arteca, Ashish Mishra, and 2 more authorsEmpirical Softw. Engg., Sep 2022
JavaScript is an increasingly popular language for server-side development, thanks in part to the Node.js runtime environment and its vast ecosystem of modules. With the Node.js package manager npm, users are able to easily include external modules as dependencies in their projects. However, npm installs modules with all of their functionality, even if only a fraction is needed, which causes an undue increase in code size. Eliminating this unused functionality from distributions is desirable, but the sound analysis required to find unused code is difficult due to JavaScript’s extreme dynamicity. We present a fully automatic technique that identifies unused code by constructing static or dynamic call graphs from the application’s tests, and replacing code deemed unreachable with either file- or function-level stubs. Due to JavaScript’s highly dynamic nature, call graph construction may suffer from unsoundness, i.e., code identified as unused may in fact be reachable. To handle such cases, if a stub is called, it will fetch and execute the original code on-demand to preserve the application’s behavior. The technique also provides an optional guarded execution mode to guard application against injection vulnerabilities in untested code that resulted from stub expansion. This technique is implemented in an open source tool called Stubbifier, designed to help package developers to produce a minimal production distribution. Stubbifier supports the ECMAScript 2019 standard. In an empirical evaluation on 15 Node.js applications and 75 clients of these applications, Stubbifier reduced application size by 56% on average while incurring only minor performance overhead. The evaluation also shows that Stubbifier’s guarded execution mode is capable of preventing several known injection vulnerabilities that are manifested in stubbed-out code. Finally, Stubbifier can work alongside bundlers, popular JavaScript tools for bundling an application with its dependencies. For the considered subject applications, we measured an average size reduction of 37% in bundled distributions.
2021
2018
- ThesisTypestates and Beyond: Verifying Rich Behavioral Properties Over Complex ProgramsAshish Mishra2018Dissertation, IISc
Statically verifying behavioral properties of programs is an important research problem. An efficient solution to this problem will have visible effects over multiple domains, ranging from program development, program debugging, program correction and verification, etc. Type systems are the most prevalently used static, light-weight verification systems for verifying certain properties of programs. Unfortunately, simple types are inadequate at verifying many behavioral/dynamic properties of programs. Typestates can tame this inadequacy of simple types by associating each type in a programming language with a state information. However, there are two major challenges in statically analyzing and verifying typestate properties over programs. The first challenge may be attributed to \increasing complexity of programs". The original work on typestates can only verify/analyze a typestate property over very simple programs which lacked dynamic memory allocation or aliasing. Subsequently, the following works on typestates extended and improvised the analysis over programs with aliasing and heaps. However, the state-of-the-art static typestate analysis works still cannot handle formidably rich programming features like asynchrony, library calls and callbacks, concurrency, etc. The second challenge may be attributed to \complexity of the property being verified". The original and the current notion of typestates can only verify a property de nable through a finite-state abstraction. This makes the state-of-the-art typestate analysis and verification works inadequate to verify useful but richer non-regular program properties. For example, using classical typestates we can verify a property like, \pop be called on a stack only after a push operation", but we cannot verify a non-regular program property like, \number of push operations should be at least equal to the number of pop operations". Currently, these behavioral properties are mostly verified/enforced by programmers at runtime via explicit checks. Unfortunately, these runtime checks are costly, error-prone, and lay an extra burden on the programmer. In this thesis we take small steps towards tackling both these challenges. Addressing complex program features, we present an asynchrony-aware static analysis, taking Android applications as our use case. Android applications have convoluted control flow, and complex features like asynchronous inter-component communications, library callbacks, Android enforced control flows (called as lifecycles), resource XMLs, which de ne and register event-handlers etc. Unfortunately, none of the available static analysis works for Android soundly captures all these features. We provide a formal semantics for Android asynchronous control flow, capturing these features. We use this semantics to introduce an intermediate program representation for Android applications, called the Android Inter-Component Control Flow Graph (AICCFG), and develop an asynchrony-aware interprocedural static analysis framework for Android applications. We use this framework to develop a static typestate analysis to capture Android resource API usage protocol violations. We present a set of benchmark applications for different resource types, and empirically compare our typestate analysis with the state-of-the-art synchronous static analyses for Android applications. Addressing the challenges associated with increasing complexity of properties, we present an expressive notion of typestates called, Parameterized typestates (p-typestates). p-typestates, associate an extra Pressburger de nable property along with states of regular typestates. This allows p-typestates to express many useful non-regular properties. We formally de ne this notion of p-typestates, and a p-typestate property automaton, to represent a p-typestate property. We present a dependent type system for these parameterized typestates and present a simple typestate-oriented language incorporating p-typestates. Further, typechecking such rich p-typestate properties require a programmer to provide invariants for loops and recursive structures. Unfortunately, providing these invariants is a non-trivial task even for expert programmers. To solve this problem, we present a simple and novel loop invariant calculation approach for Pressburger de nable systems. We encode a number of real-world programs in our dependently-typed language and use p-typestates type system, and loop-invariant calculation to verify several rich properties which are not amenable to regular typestate analysis and verification. Finally we discuss several possible extensions of our thesis along both these directions
2017
- ArXivPresburger-Definable Parameterized TypestatesAshish Mishra, Deepak D’Souza, and Y. N. SrikantCoRR, 2017
Typestates are good at capturing dynamic states of a program as compared to normal types that can capture static structural properties of data and program. Although useful, typestates are suitable only for specifying and verifying program properties defined using finite-state abstractions. Many useful dynamic properties of programs are not finite-state definable. To address these issues, we introduce parameterized typestates (p-typestates). p-typestates associate a logical property with each state of regular typestate, thereby allowing specification of properties beyond finite-state abstractions. We present a dependent type system to express and verify p-typestate properties and a typestate-oriented core programming language incorporating these dependent types. Automatic inductive type-checking of p-typestate properties usually requires a programmer to provide loop invariants as annotations. Here we propose a way to calculate loop invariants automatically, using loop acceleration techniques for Presburger definable transition systems.
- ECOOP 17, DSAnalysis and verification of rich typestate properties for complex programsAshish Mishra, and Y. N. Srikant2017ECOOP’ 2017 Doctoral Symposium
Typestates are useful programming language concepts to model software protocols. In this thesis we provide programming language based approaches for analysis, checking and verification of rich typestate properties over complex programs. Firstly, we use known typestate analysis approaches to capture crucial API protocol violations in Android applications. The complex control flow semantics of these programs makes the task challenging, while the excessive usage of resources and other APIs by Android makes it important. Secondly, we tackle the expressive limitations associated with typestates, and present a generalized notion of typestate using the expressive power of dependent types. These expressive typestates, which we term as Beyond-Regular Typestate (BR-typestate), are expressive enough to model many important non-regular properties (typestates can only express regular program properties), and yet have decidable type-checking, and even a decidable type-inference in certain cases. We further present a practical typestate oriented, dependently typed language incorporating these BR-typestates and present soundness results about the type system. For both the parts of the work, we create prototype systems to empirically evaluate the concepts discussed.
2016
- MEMOCODE ’16Asynchrony-aware static analysis of Android applicationsAshish Mishra, Aditya Kanade, and Y. N. SrikantIn 2016 ACM/IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE), 2016