Finally make a clean sweep and delete/organize old files. Add a skeleton for the LaTeX formal writeup in doc/ and change the license to BSD-2-Clause-Patent (since this is all new code from the past few years)

Nathan Braswell
2022-01-30 16:57:21 -05:00
parent 315ae20698
commit 7f220c97b8
325 changed files with 901 additions and 31024 deletions

.gitignore vendored
View File

@@ -31,3 +31,4 @@ bootstrap_kalypso
kraken_bootstrap
compiler_version.krak
untracked_misc
k_prime

View File

@@ -1,41 +0,0 @@
Kraken Compiled Grammar file format (.kgm.comp)
This file is generated on first run and regenerated every time the grammar changes.
It contains the RNGLR table generated from the specified grammar so that it does not
have to be remade every time Kraken is run, saving a lot of time.
(at time of writing, non-cached: ~30 seconds, cached: <1 second)
This is a binary format. The first bytes are a magic number ("KRAK" in ASCII).
The next bytes are an unsigned integer indicating how many characters follow.
Next are these characters, which are the grammar file as one long string.
Next is the parse table length, followed by the table itself, exported with the table's export method.
It can be imported with the import method.
Note that within the parse table's data are parse actions, and within that, Symbols.
The format: (more or less)
____________________
|KRAK
|length_of_grammar_text
|GRAMMAR_TEXT
|PARSE_TABLE
|-|length_of_symbol_index_vector
|-|SYMBOL_INDEX_VECTOR
|-|length_of_out_table_vector
|-|OUT_TABLE_VECTOR
|-|-|length_of_mid_table_vector
|-|-|MID_TABLE_VECTOR
|-|-|-|length_of_in_table_vector
|-|-|-|IN_TABLE_VECTOR
|-|-|-|-|length_of_parse_action
|-|-|-|-|PARSE_ACTION
|-|-|-|-|-|ActionType
|-|-|-|-|-|ParseRule__if_exists
|-|-|-|-|-|-|pointerIndex
|-|-|-|-|-|-|Symbol_left_handle
|-|-|-|-|-|-|rightside_vector_symbol
|-|-|-|-|-|shiftState
____________________
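As an illustration of the layout above, here is a minimal C++ sketch (not the compiler's actual loader; the helper name, header names, and exact field widths are assumptions) that checks whether an existing .kgm.comp cache still matches the current grammar text. In the real compiler, the bytes that follow the grammar string would be handed to the parse table's import method.

#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper: returns true if the cache starts with the "KRAK" magic
// number and its embedded grammar text matches the current grammar file, so
// the cached RNGLR table can be reused instead of being regenerated.
bool cacheMatchesGrammar(const std::string &cachePath, const std::string &grammarText) {
    std::ifstream in(cachePath, std::ios::binary);
    char magic[4];
    if (!in.read(magic, 4) || std::memcmp(magic, "KRAK", 4) != 0)
        return false;                                  // wrong or missing magic number
    unsigned int len = 0;                              // assumed width of the length field
    if (!in.read(reinterpret_cast<char *>(&len), sizeof(len)))
        return false;
    std::string cached(len, '\0');                     // the grammar file as one long string
    if (len > 0 && !in.read(&cached[0], len))
        return false;
    // A mismatch means the cache is stale and the parse table must be rebuilt.
    // Otherwise the parse table length and table data that follow would be
    // passed to the table's import method.
    return cached == grammarText;
}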

View File

@@ -1,21 +1,19 @@
The MIT License (MIT)
Copyright (c) 2020-2022 Nathan Braswell
Copyright (c) 2014-2016 Nathan Christopher Braswell, Google Inc.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Subject to the terms and conditions of this license, each copyright holder and contributor hereby grants to those receiving rights under this license a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except for failure to satisfy the conditions of this license) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer this software, where such license applies only to those patent claims, already acquired or hereafter acquired, licensable by such copyright holder or contributor that are necessarily infringed by:
(a) their Contribution(s) (the licensed copyrights of copyright holders and non-copyrightable additions of contributors, in source or binary form) alone; or
(b) combination of their Contribution(s) with the work of authorship to which such Contribution(s) was added by such copyright holder or contributor, if, at the time the Contribution is added, such addition causes such combination to be necessarily infringed. The patent license shall not apply to any other combinations which include the Contribution.
Except as expressly stated above, no rights or licenses from any copyright holder or contributor is granted under this license, whether expressly, by implication, estoppel or otherwise.
DISCLAIMER
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View File

@@ -3,30 +3,21 @@ Kraken
The Kraken Programming Language
(try it out online at http://www.kraken-lang.org/)
(more information at http://www.kraken-lang.org/, which is still under construction / needs to be updated, and has a try-it-online feature for an older version without partial evaluation)
(vim integration (filetype, syntax highlighting, Syntastic) at https://github.com/Limvot/kraken.vim)
(emacs integration (filetype, syntax highlighting) at https://github.com/Limvot/kraken-mode)
Currently developing the third iteration: a Scheme-like language based on a functional Vau calculus, partially evaluated for efficiency and compiled to WebAssembly.
The Kraken Programming Language is functional but very much still in development.
It has both the normal features you might expect of a modern language (functions, variables, an object system, dynamic memory) as well as some more advanced ones (mutually recursive definitions, lambdas/closures, algebraic data types, templates, marker traits, defer statements, etc.).
*Heavily* inspired by John Shutt's thesis: https://web.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unrestricted/jshutt.pdf
with partial evaluation during compilation to make it efficient.
Kraken can either compile to C or its own bytecode which is then interpreted.
Dependencies
============
Kraken is self-hosted: to build it, a script is included that compiles the original C++ version (which depends on CMake) and then checks out and compiles each necessary intermediate version up to the current one. This can take quite a while; when it hits 1.0 I am planning on removing the old C++ version and checking in a pre-compiled-to-C version to use for further bootstrapping.
Licensed under
SPDX-License-Identifier: BSD-2-Clause-Patent
Note: This license is designed to provide: a) a simple permissive license; b) that is compatible with the GNU General Public License (GPL), version 2; and c) which also has an express patent grant included.
(Note taken from https://opensource.org/licenses/BSDplusPatent )
Goals
=====
It has the following design goals:
* Compiled
* Clean
* Fast (both running and writing)
* Good for Systems (including Operating Systems) programming
* Very powerful libraries (say, a library that allows you import from automatically parsed C header files)
* Minimal "magic" code (no runtime, no other libraries automatically included)
It is inspired by C, Kotlin, Rust, and Jai.

View File

@@ -1,147 +0,0 @@
#!/usr/bin/env bash
kraken="kraken"
bootstrap_commits=(cf46fb13afe66ba475db9725e9269c9c1cd3bbc3 2cd43e5a217318c70097334b3598d2924f64b362 2051f54b559ac5edf67277d4f1134aca2cb9215d ecbbcb4eda56e2467efb0a04e7d668b95856aa4b d126cbf24ba8b26e3814e2260d555ecaee86508c 947384cced5397a517a71963edc8f47e668d734f cfcaff7887a804fe77dadaf2ebb0251d6e8ae8e2 12dfa837e31bf09adb1335219473b9a7e6db9eac acb0e48324f353d30d148eb11d1bf2843d83b51a 29eff2a23e5c8afc59dc71a9ecd74cedbd5663c3 0f2ac1421a4da5ff63a2df94efa2bcb37eec40b8 f71b5f3576b5ddbb19b8df4e5d786f0147160c13 fb63eee9e8a38a9df68903ec9acac7408aebc824 6f659ece49debe79b9f1a0b272ab7cce14d84c85 c0209118e5c06a9f03bb56d032aeccbc28bfbf73 5b46089694d9c51cc302c8dbb952495f3e6301c6)
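# Overview of the bootstrap chain:
#   1. With no compiler binary present, clone this repo into bootstrap_kalypso and either
#      build the old C++ compiler at bootstrap_commits[0] with CMake, or start from the
#      newest pre-generated kraken.krak.c found under cached_builds/.
#   2. Step through bootstrap_commits in order, rebuilding kraken_bootstrap with the
#      previous stage at each commit and caching the emitted C for future runs.
#   3. Use kraken_bootstrap to build kraken_deprecated, then kraken_bac, and finally the
#      current kraken binary from kraken.krak.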
if ! [ -s "cached_builds" ]
then
mkdir cached_builds
fi
if [[ $1 == "clean" ]]
then
rm ${kraken}
rm ${kraken}_bac
rm ${kraken}_deprecated
rm ${kraken}_bootstrap
rm -rf bootstrap_kalypso
else
if [[ $1 == "backup" ]]
then
rm ${kraken}
fi
if [[ $1 == "from_old" ]]
then
rm ${kraken}
rm ${kraken}_bac
rm ${kraken}_deprecated
fi
if [[ $1 == "rebuild" ]]
then
rm ${kraken}
rm ${kraken}_bac
rm ${kraken}_deprecated
rm ${kraken}_bootstrap
rm -rf bootstrap_kalypso
fi
if [ -s "$kraken" ]
then
#echo "$kraken exists, calling"
./${kraken} ${kraken}.krak ${kraken}
else
echo "gotta make $kraken, testing for compilers to do so"
if ! [ -s "${kraken}_bac" ]
then
if ! [ -s "${kraken}_deprecated" ]
then
echo "no ${kraken}_deprecated, bootstrapping using kraken_bootstrap"
if ! [ -s "${kraken}_bootstrap" ]
then
# Check to see if we have a cached version
cached_index=0
for ((i=1; i < ${#bootstrap_commits[@]}; i++))
do
echo "checking for cached kalypso part $i"
echo "commit hash: ${bootstrap_commits[$i]}"
if [ -s "cached_builds/${bootstrap_commits[i]}" ]
then
cached_index=$i
echo "have cached: ${bootstrap_commits[$i]}"
else
echo "do not have cached: ${bootstrap_commits[$i]}"
fi
done
git clone . bootstrap_kalypso
pushd bootstrap_kalypso
if [[ $cached_index == "0" ]]
then
echo "no ${kraken}_bootstrap, bootstrapping using Cephelpod and a chain of old Kalypsos"
git checkout ${bootstrap_commits[0]}
cp -r stdlib deprecated_compiler
cp krakenGrammer.kgm deprecated_compiler
cp kraken.krak deprecated_compiler
pushd deprecated_compiler
mkdir build
pushd build
cmake ..
make
popd
mkdir build_kraken
mv kraken.krak build_kraken
pushd build_kraken
../build/kraken kraken.krak
popd
popd
pushd deprecated_compiler/build_kraken/kraken
sh kraken.sh
popd
cp deprecated_compiler/build_kraken/kraken/kraken ./${kraken}_bootstrap
else
echo "no ${kraken}_bootstrap, bootstrapping using starting from cached version"
git checkout ${bootstrap_commits[$cached_index]}
cp "../cached_builds/${bootstrap_commits[$cached_index]}/kraken.krak.c" "./"
cc kraken.krak.c -lm -lpthread -O3 -o kraken_bootstrap
fi
# loop through the chain
for ((i=$cached_index+1; i < ${#bootstrap_commits[@]}; i++))
do
echo "building kalypso bootstrap part $i"
echo "commit hash: ${bootstrap_commits[$i]}"
mv ./krakenGrammer.kgm krakenGrammer.kgm_old
git checkout ${bootstrap_commits[$i]}
echo "var version_string = \"BOOTSTRAPPING VERSION - Self-hosted Kraken compiler \\\"Kalypso\\\" - revision $(git rev-list HEAD | wc -l), commit: $(git rev-parse HEAD)\";" > compiler_version.krak
mv ./krakenGrammer.kgm krakenGrammer.kgm_new
mv ./krakenGrammer.kgm_old krakenGrammer.kgm
# Quick fix - I made a commit that actually depends on its own grammar to be built
if [[ ${bootstrap_commits[$i]} == "12dfa837e31bf09adb1335219473b9a7e6db9eac" ]]
then
echo "Hot fixing mistake - using new grammer instead of old"
cp ./krakenGrammer.kgm_new krakenGrammer.kgm
fi
./${kraken}_bootstrap kraken.krak ${kraken}_bootstrap
mkdir "../cached_builds/${bootstrap_commits[$i]}"
cp "./kraken.krak.c" "../cached_builds/${bootstrap_commits[$i]}/"
mv ./krakenGrammer.kgm_new krakenGrammer.kgm
done
popd # out of bootstrap
fi
echo "making kraken_deprecated - the first current Kraken version, but built with an old compiler"
# Now make real
mv ./krakenGrammer.kgm krakenGrammer.kgm_new
mv ./krakenGrammer.kgm.comp_new krakenGrammer.kgm.comp_new_new
cp bootstrap_kalypso/krakenGrammer.kgm ./
cp bootstrap_kalypso/krakenGrammer.kgm.comp_new ./
cp bootstrap_kalypso/${kraken}_bootstrap ./${kraken}_bootstrap
./${kraken}_bootstrap kraken.krak ${kraken}_deprecated
mv ./krakenGrammer.kgm_new krakenGrammer.kgm
mv ./krakenGrammer.kgm.comp_new_new krakenGrammer.kgm.comp_new
else
echo "${kraken}_deprecated exists, calling"
fi
echo "making kraken_bac, a current compiler built with kraken_deprecated"
./${kraken}_deprecated kraken.krak ${kraken}_bac
else
echo "${kraken}_bac exists, calling"
fi
echo "making kraken, the real current compiler built with kraken_bac"
./${kraken}_bac kraken.krak ${kraken}
fi
fi
#./${kraken} $@

View File

@@ -1,2 +0,0 @@
#!/bin/sh
cloc --read-lang-def=kraken_cloc_definition.txt kraken.krak stdlib/

View File

@@ -1,5 +0,0 @@
build
build_kraken
krakenGrammer.kgm
krakenGrammer.kgm.comp
stdlib

View File

@@ -1,29 +0,0 @@
cmake_minimum_required (VERSION 2.6)
project(Kraken)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -g")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3")
set( MY_INCLUDES ${PROJECT_SOURCE_DIR}/include)
set( MY_SOURCES main.cpp src/Parser.cpp src/GraphStructuredStack.cpp
src/RNGLRParser.cpp src/ParseAction.cpp src/ParseRule.cpp src/Symbol.cpp
src/StringReader.cpp src/State.cpp src/util.cpp src/Lexer.cpp
src/RegEx.cpp src/RegExState.cpp src/Table.cpp src/ASTData.cpp
src/ASTTransformation.cpp src/CGenerator.cpp src/Type.cpp src/Importer.cpp
src/Tester.cpp src/CCodeTriple.cpp)
add_custom_target(STDLibCopy ALL)
add_custom_command(TARGET STDLibCopy POST_BUILD
COMMAND ${CMAKE_COMMAND} -E copy_directory
"${PROJECT_SOURCE_DIR}/stdlib"
"${PROJECT_BINARY_DIR}/stdlib")
include_directories( ${MY_INCLUDES} )
add_executable(kraken ${MY_SOURCES})

View File

@@ -1,41 +0,0 @@
#ifndef ASTDATA_H
#define ASTDATA_H
#include <vector>
#include <map>
#include <set>
#include "Symbol.h"
//Circular dependency
class Type;
#include "Type.h"
#ifndef NULL
#define NULL ((void*)0)
#endif
enum ASTType {undef, translation_unit, import, identifier, type_def, adt_def,
function, code_block, typed_parameter, expression, boolean_expression, statement,
if_statement, match_statement, case_statement, while_loop, for_loop, return_statement, break_statement,
continue_statement, defer_statement, assignment_statement, declaration_statement, if_comp, simple_passthrough,
passthrough_params, in_passthrough_params, out_passthrough_params, opt_string, param_assign, function_call, value};
class ASTData {
public:
ASTData();
ASTData(ASTType type, Type *valueType = NULL);
ASTData(ASTType type, Symbol symbol, Type *valueType = NULL);
~ASTData();
std::string toString();
static std::string ASTTypeToString(ASTType type);
ASTType type;
Type* valueType;
Symbol symbol;
std::map<std::string, std::vector<NodeTree<ASTData>*>> scope;
std::set<NodeTree<ASTData>*> closedVariables;
private:
};
#endif

View File

@@ -1,87 +0,0 @@
#ifndef ASTTRANSFORMATION_H
#define ASTTRANSFORMATION_H
#include <set>
#include <map>
#include <iterator>
#include <algorithm>
#include "Type.h"
#include "ASTData.h"
#include "NodeTransformation.h"
#include "Importer.h"
class Importer;
class ASTTransformation: public NodeTransformation<Symbol,ASTData> {
public:
ASTTransformation(Importer* importerIn);
~ASTTransformation();
NodeTree<Symbol>* getNode(std::string lookup, std::vector<NodeTree<Symbol>*> nodes);
NodeTree<Symbol>* getNode(std::string lookup, NodeTree<Symbol>* parent);
std::vector<NodeTree<Symbol>*> getNodes(std::string lookup, std::vector<NodeTree<Symbol>*> nodes);
std::vector<NodeTree<Symbol>*> getNodes(std::string lookup, NodeTree<Symbol>* parent);
//First pass defines all type_defs (objects and aliases)
NodeTree<ASTData>* firstPass(std::string fileName, NodeTree<Symbol>* parseTree);
std::set<std::string> parseTraits(NodeTree<Symbol>* traitsNode);
//Second pass defines data inside objects, outside declaration statements, and function prototypes (since we have type_defs now)
void secondPass(NodeTree<ASTData>* ast, NodeTree<Symbol>* parseTree);
void secondPassDoClassInsides(NodeTree<ASTData>* typeDef, std::vector<NodeTree<Symbol>*> typedefChildren, std::map<std::string, Type*> templateTypeReplacements);
NodeTree<ASTData>* secondPassDeclaration(NodeTree<Symbol>* from, NodeTree<ASTData>* scope, std::map<std::string, Type*> templateTypeReplacements);
NodeTree<ASTData>* secondPassFunction(NodeTree<Symbol>* from, NodeTree<ASTData>* scope, std::map<std::string, Type*> templateTypeReplacements);
//The third pass does all the function bodies
void thirdPass(NodeTree<ASTData>* ast, NodeTree<Symbol>* parseTree);
NodeTree<ASTData>* searchScopeForFunctionDef(NodeTree<ASTData>* scope, NodeTree<Symbol>* parseTree, std::map<std::string, Type*> templateTypeReplacements);
void thirdPassFunction(NodeTree<Symbol>* from, NodeTree<ASTData>* functionDef, std::map<std::string, Type*> templateTypeReplacements);
//The fourth pass finishes instantiation of templated objects
//it used to be a part of the third pass, but it was split out because it has to be done in a loop
//with all the other asts until none change anymore (it returns a bool if it instantiated a new one)
bool fourthPass(NodeTree<ASTData>* ast, NodeTree<Symbol>* parseTree);
virtual NodeTree<ASTData>* transform(NodeTree<Symbol>* from);
NodeTree<ASTData>* transform(NodeTree<Symbol>* from, NodeTree<ASTData>* scope, std::vector<Type> types, bool limitToFunction, std::map<std::string, Type*> templateTypeReplacements);
std::vector<NodeTree<ASTData>*> transformChildren(std::vector<NodeTree<Symbol>*> children, std::set<int> skipChildren, NodeTree<ASTData>* scope, std::vector<Type> types, bool limitToFunction, std::map<std::string, Type*> templateTypeReplacements);
std::string concatSymbolTree(NodeTree<Symbol>* root);
NodeTree<ASTData>* doFunction(NodeTree<ASTData>* scope, std::string lookup, std::vector<NodeTree<ASTData>*> nodes, std::map<std::string, Type*> templateTypeReplacements);
NodeTree<ASTData>* generateThis(NodeTree<ASTData>* scope);
std::set<NodeTree<ASTData>*> findVariablesToClose(NodeTree<ASTData>* func, NodeTree<ASTData>* stat, NodeTree<ASTData>* scope);
bool inScopeChain(NodeTree<ASTData>* node, NodeTree<ASTData>* scope);
NodeTree<ASTData>* functionLookup(NodeTree<ASTData>* scope, std::string lookup, std::vector<Type> types);
NodeTree<ASTData>* templateFunctionLookup(NodeTree<ASTData>* scope, std::string lookup, std::vector<Type*>* templateInstantiationTypes, std::vector<Type> types, std::map<std::string, Type*> scopeTypeMap);
std::vector<NodeTree<ASTData>*> scopeLookup(NodeTree<ASTData>* scope, std::string lookup, bool includeModules = false);
std::vector<NodeTree<ASTData>*> scopeLookup(NodeTree<ASTData>* scope, std::string lookup, bool includeModules, std::set<NodeTree<ASTData>*> visited);
NodeTree<ASTData>* getUpperTranslationUnit(NodeTree<ASTData>* node);
NodeTree<ASTData>* addToScope(std::string name, NodeTree<ASTData>* toAdd, NodeTree<ASTData>* addTo);
Type* typeFromTypeNode(NodeTree<Symbol>* typeNode, NodeTree<ASTData>* scope, std::map<std::string, Type*> templateTypeReplacements);
NodeTree<ASTData>* templateClassLookup(NodeTree<ASTData>* scope, std::string name, std::vector<Type*> templateInstantiationTypes);
void unifyType(NodeTree<Symbol> *syntaxType, Type type, std::map<std::string, Type>* templateTypeMap, std::map<std::string, Type*> typeMap);
void unifyTemplateFunction(NodeTree<ASTData>* templateFunction, std::vector<Type> types, std::vector<Type*>* templateInstantiationTypes, std::map<std::string, Type*> typeMap);
NodeTree<ASTData>* tryToFindOrInstantiateFunctionTemplate(std::string functionName, NodeTree<ASTData>* scope, std::vector<Type> types, std::map<std::string, Type*> templateTypeReplacements);
NodeTree<ASTData>* findOrInstantiateFunctionTemplate(std::string functionName, NodeTree<ASTData>* scope, std::vector<Type> types, std::map<std::string, Type*> templateTypeReplacements);
NodeTree<ASTData>* findOrInstantiateFunctionTemplate(std::vector<NodeTree<Symbol>*> children, NodeTree<ASTData>* scope, std::vector<Type> types, std::map<std::string, Type*> templateTypeReplacements);
NodeTree<ASTData>* findOrInstantiateFunctionTemplate(std::string functionName, std::vector<NodeTree<Symbol>*> children, NodeTree<ASTData>* scope, std::vector<Type> types, std::map<std::string, Type*> templateTypeReplacements);
std::map<std::string, Type*> makeTemplateFunctionTypeMap(NodeTree<Symbol>* templateNode, std::vector<Type*> types, std::map<std::string, Type*> scopeTypeMap);
std::vector<std::pair<std::string, std::set<std::string>>> makeTemplateNameTraitPairs(NodeTree<Symbol>* templateNode);
private:
Importer * importer;
NodeTree<ASTData>* builtin_trans_unit; // the top scope for language level stuff
std::map<std::string, std::vector<NodeTree<ASTData>*>> languageLevelReservedWords;
std::map<std::string, std::vector<NodeTree<ASTData>*>> languageLevelOperators;
std::map<NodeTree<ASTData>*, NodeTree<ASTData>*> this_map; // used to map implicit "this" variables to their type
NodeTree<ASTData>* topScope; //maintained for templates that need to add themselves to the top scope no matter where they are instantiated
int lambdaID = 0;
};
std::vector<Type> mapNodesToTypes(std::vector<NodeTree<ASTData>*> nodes);
std::vector<Type*> mapNodesToTypePointers(std::vector<NodeTree<ASTData>*> nodes);
#endif

View File

@@ -1,25 +0,0 @@
#ifndef CCODETRIPLE_H
#define CCODETRIPLE_H
#include <string>
#include <iostream>
#include "util.h"
class CCodeTriple {
public:
CCodeTriple(std::string pre, std::string val, std::string post);
CCodeTriple(std::string val);
CCodeTriple(const char* val);
CCodeTriple();
~CCodeTriple();
std::string oneString(bool endValue = false);
CCodeTriple & operator=(const CCodeTriple &rhs);
CCodeTriple & operator+=(const CCodeTriple &rhs);
std::string preValue;
std::string value;
std::string postValue;
private:
};
CCodeTriple operator+(const CCodeTriple &a, const CCodeTriple &b);
#endif //CCODETRIPLE_H

View File

@@ -1,72 +0,0 @@
#ifndef CGENERATOR_H
#define CGENERATOR_H
#include <string>
#include <iostream>
#include <fstream>
#include <utility>
#include <stack>
#include <sys/stat.h>
#include "CCodeTriple.h"
#include "NodeTree.h"
#include "ASTData.h"
#include "Type.h"
// for mapNodesToTypes
#include "ASTTransformation.h"
#include "util.h"
#include "Poset.h"
// Note the use of std::pair to hold two strings - the running string for the header file and the running string for the c file.
enum ClosureTypeSpecialType { ClosureTypeRegularNone, ClosureFunctionPointerTypeWithoutClosedParam, ClosureFunctionPointerTypeWithClosedParam };
class CGenerator {
public:
CGenerator();
~CGenerator();
int generateCompSet(std::map<std::string, NodeTree<ASTData>*> ASTs, std::string outputName);
std::string generateTypeStruct(NodeTree<ASTData>* from);
bool isUnderNodeWithType(NodeTree<ASTData>* from, ASTType type);
bool isUnderTranslationUnit(NodeTree<ASTData>* from, NodeTree<ASTData>* typeDefinition);
NodeTree<ASTData>* highestScope(NodeTree<ASTData>* node);
std::pair<std::string, std::string> generateTranslationUnit(std::string name, std::map<std::string, NodeTree<ASTData>*> ASTs);
CCodeTriple generate(NodeTree<ASTData>* from, NodeTree<ASTData>* enclosingObject = NULL, bool justFuncName = false, NodeTree<ASTData>* enclosingFunction = NULL);
std::string generateAliasChains(std::map<std::string, NodeTree<ASTData>*> ASTs, NodeTree<ASTData>* definition);
std::string closureStructType(std::set<NodeTree<ASTData>*> closedVariables);
std::string ValueTypeToCType(Type *type, std::string, ClosureTypeSpecialType closureSpecial = ClosureTypeRegularNone);
std::string ValueTypeToCTypeDecoration(Type *type, ClosureTypeSpecialType closureSpecial = ClosureTypeRegularNone);
std::string ValueTypeToCTypeThingHelper(Type *type, std::string ptrStr, ClosureTypeSpecialType closureSpecial);
static std::string CifyName(std::string name);
static std::string scopePrefix(NodeTree<ASTData>* from);
std::string simpleComplexName(std::string simpleName, std::string complexName);
std::string prefixIfNeeded(std::string prefix, std::string name);
std::string generateObjectMethod(NodeTree<ASTData>* enclosingObject, NodeTree<ASTData>* from, std::string *functionPrototype);
NodeTree<ASTData>* getMethodsObjectType(NodeTree<ASTData>* scope, std::string functionName);
NodeTree<ASTData>* getMethod(Type* type, std::string method, std::vector<Type> types);
bool methodExists(Type* type, std::string method, std::vector<Type> types);
std::string generateMethodIfExists(Type* type, std::string method, std::string parameter, std::vector<Type> methodTypes);
std::string emitDestructors(std::vector<NodeTree<ASTData>*> possibleDeclarations, NodeTree<ASTData>* enclosingObject);
std::string tabs();
std::string getID();
int tabLevel;
int id;
std::string function_header;
std::string generatorString;
std::string linkerString;
std::string functionTypedefString;
std::string functionTypedefStringPre;
std::set<std::string> usedNameSet;
std::map<std::string, std::string> simpleComplexNameMap;
std::map<Type, triple<std::string, std::string, std::string>> functionTypedefMap;
std::map<std::set<NodeTree<ASTData>*>, std::string> closureStructMap;
std::vector<std::vector<NodeTree<ASTData>*>> distructDoubleStack;
std::stack<int> loopDistructStackDepth;
std::vector<std::vector<NodeTree<ASTData>*>> deferDoubleStack;
std::stack<int> loopDeferStackDepth;
private:
};
#endif

View File

@@ -1,52 +0,0 @@
#ifndef COLLAPSETRANSFORMATION_H
#define COLLAPSETRANSFORMATION_H
#include <queue>
#include <vector>
#include "NodeTransformation.h"
template<class T>
class CollapseTransformation: public NodeTransformation<T,T> {
public:
CollapseTransformation(T toCollapse);
~CollapseTransformation();
virtual NodeTree<T>* transform(NodeTree<T>* from);
private:
T toCollapse;
};
#endif
template<class T>
CollapseTransformation<T>::CollapseTransformation(T toCollapse) {
this->toCollapse = toCollapse;
}
template<class T>
CollapseTransformation<T>::~CollapseTransformation() {
//
}
template<class T>
NodeTree<T>* CollapseTransformation<T>::transform(NodeTree<T>* from) {
std::queue<NodeTree<T>*> toProcess;
toProcess.push(from);
while(!toProcess.empty()) {
NodeTree<T>* node = toProcess.front();
toProcess.pop();
std::vector<NodeTree<T>*> children = node->getChildren();
for (int i = 0; i < children.size(); i++) {
if (children[i]->getData() == toCollapse) {
node->removeChild(children[i]);
std::vector<NodeTree<T>*> newChildren = children[i]->getChildren();
node->insertChildren(i,newChildren);
toProcess.push(node); //Do this node again
}
else
toProcess.push(children[i]);
}
}
return from;
}
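For illustration, here is a minimal usage sketch of this transformation (the grammar symbol name, source text, helper name, and header file names are hypothetical; the Symbol and RNGLRParser interfaces are the ones declared elsewhere in this commit):

#include <string>
#include "RNGLRParser.h"
#include "CollapseTransformation.h"

// Collapse a wrapper non-terminal out of a parse tree: every node whose data
// equals the given symbol is removed and its children are spliced into its
// former parent, in place.
NodeTree<Symbol>* collapseWrapper(RNGLRParser &parser, const std::string &source) {
    NodeTree<Symbol>* tree = parser.parseInput(source, "example.krak", /*highlight_errors=*/false);
    CollapseTransformation<Symbol> collapse(Symbol("opt_string", /*isTerminal=*/false)); // assumed symbol name
    return collapse.transform(tree);
}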

View File

@@ -1,48 +0,0 @@
#ifndef DELETETRANSFORMATION_H
#define DELETETRANSFORMATION_H
#include <queue>
#include <vector>
#include "NodeTransformation.h"
template<class T>
class DeleteTransformation: public NodeTransformation<T,T> {
public:
DeleteTransformation(T toDelete);
~DeleteTransformation();
virtual NodeTree<T>* transform(NodeTree<T>* from);
private:
T toRemove;
};
#endif
template<class T>
DeleteTransformation<T>::DeleteTransformation(T toRemove) {
this->toRemove = toRemove;
}
template<class T>
DeleteTransformation<T>::~DeleteTransformation() {
//
}
template<class T>
NodeTree<T>* DeleteTransformation<T>::transform(NodeTree<T>* from) {
std::queue<NodeTree<T>*> toProcess;
toProcess.push(from);
while(!toProcess.empty()) {
NodeTree<T>* node = toProcess.front();
toProcess.pop();
std::vector<NodeTree<T>*> children = node->getChildren();
for (int i = 0; i < children.size(); i++) {
if (children[i]->getData() == toRemove)
node->removeChild(children[i]);
else
toProcess.push(children[i]);
}
}
return from;
}

View File

@@ -1,38 +0,0 @@
#include <iostream>
#include <vector>
#include <queue>
#include <map>
#include "NodeTree.h"
#include "Symbol.h"
#include "util.h"
#ifndef GRAPH_STRUCTURED_STACK
#define GRAPH_STRUCTURED_STACK
class GraphStructuredStack {
public:
GraphStructuredStack();
~GraphStructuredStack();
NodeTree<int>* newNode(int stateNum);
void addToFrontier(int frontier, NodeTree<int>* node);
NodeTree<int>* inFrontier(int frontier, int state);
int getContainingFrontier(NodeTree<int>* node);
bool frontierIsEmpty(int frontier);
NodeTree<int>* frontierGetAccState(int frontier);
std::vector<NodeTree<int>*>* getReachable(NodeTree<int>* start, int length);
std::vector<std::vector<NodeTree<int>*> >* getReachablePaths(NodeTree<int>* start, int length);
void recursivePathFind(NodeTree<int>* start, int length, std::vector<NodeTree<int>*> currentPath, std::vector<std::vector<NodeTree<int>*> >* paths);
bool hasEdge(NodeTree<int>* start, NodeTree<int>* end);
NodeTree<Symbol>* getEdge(NodeTree<int>* start, NodeTree<int>* end);
void addEdge(NodeTree<int>* start, NodeTree<int>* end, NodeTree<Symbol>* edge);
void clear();
std::vector<int> getFrontier(int frontier);
std::string toString();
private:
std::vector<std::vector<NodeTree<int>*>*> gss;
std::map< std::pair< NodeTree<int>*, NodeTree<int>* >, NodeTree<Symbol>* > edges;
std::map< NodeTree<int>*, int > containing_frontier_map;
};
#endif

View File

@@ -1,48 +0,0 @@
#ifndef __IMPORTER__H_
#define __IMPORTER__H_
#include <string>
#include <vector>
#include <iostream>
#include <fstream>
#include <sys/stat.h>
#include "Parser.h"
#include "NodeTree.h"
#include "ASTData.h"
#include "Symbol.h"
#include "RemovalTransformation.h"
#include "CollapseTransformation.h"
#include "ASTTransformation.h"
class ASTTransformation;
class Importer {
public:
Importer(Parser* parserIn, std::vector<std::string> includePaths, std::string outputName, bool only_parseIn = false);
~Importer();
void import(std::string fileName);
NodeTree<ASTData>* getUnit(std::string fileName);
NodeTree<ASTData>* importFirstPass(std::string fileName);
NodeTree<Symbol>* parseAndTrim(std::string fileName);
void registerAST(std::string name, NodeTree<ASTData>* ast, NodeTree<Symbol>* syntaxTree);
std::map<std::string, NodeTree<ASTData>*> getASTMap();
private:
std::string outputName;
ASTTransformation *ASTTransformer;
struct importTriplet {
std::string name;
NodeTree<ASTData>* ast;
NodeTree<Symbol>* syntaxTree;
};
bool only_parse;
std::vector<importTriplet> importedTrips;
std::vector<std::string> includePaths;
Parser* parser;
std::vector<Symbol> removeSymbols;
std::vector<Symbol> collapseSymbols;
std::map<std::string, NodeTree<ASTData>*> imported;
};
#endif

View File

@@ -1,26 +0,0 @@
#ifndef LEXER_H
#define LEXER_H
#include "util.h"
#include "StringReader.h"
#include "RegEx.h"
#include "Symbol.h"
#include <string>
class Lexer {
public:
Lexer();
Lexer(std::string inputString);
~Lexer();
void addRegEx(std::string regExString);
void setInput(std::string inputString);
Symbol next();
void reset();
static void test();
private:
std::vector<RegEx*> regExs;
std::string input;
int currentPosition;
};
#endif

View File

@@ -1,35 +0,0 @@
#ifndef NODETRANSFORMATION_H
#define NODETRANSFORMATION_H
#include "NodeTree.h"
#ifndef NULL
#define NULL ((void*)0)
#endif
template <class FROM, class TO>
class NodeTransformation {
public:
NodeTransformation();
virtual ~NodeTransformation();
virtual NodeTree<TO>* transform(NodeTree<FROM>* from)=0;
private:
};
template <class FROM, class TO>
NodeTransformation<FROM,TO>::NodeTransformation() {
//Nothing
}
template <class FROM, class TO>
NodeTransformation<FROM,TO>::~NodeTransformation() {
//Nothing
}
// template <class FROM, class TO>
// NodeTree<TO>* NodeTransformation<FROM,TO>::transform(NodeTree<FROM>* from) {
// return (NodeTree<TO>*)0x1234;
// }
#endif

View File

@@ -1,277 +0,0 @@
#ifndef NODETREE_H
#define NODETREE_H
#ifndef NULL
#define NULL ((void*)0)
#endif
#include <vector>
#include <string>
#include <iostream>
#include "util.h"
template<class T>
class NodeTree {
public:
NodeTree();
NodeTree(std::string name, T inData);
~NodeTree();
bool const operator==(NodeTree &other);
bool const operator<(const NodeTree &other) const;
void setParent(NodeTree<T>* parent);
void addParent(NodeTree<T>* parent);
NodeTree<T>* getParent();
std::vector<NodeTree<T>*> getParents();
void addChild(NodeTree<T>* child);
void insertChild(int i, NodeTree<T>* child);
void addChildren(std::vector<NodeTree<T>*>* children);
void addChildren(std::vector<NodeTree<T>*> children);
void insertChildren(int index, std::vector<NodeTree<T>*>* children);
void insertChildren(int index, std::vector<NodeTree<T>*> children);
int findChild(NodeTree<T>* child);
void removeChild(NodeTree<T>* child);
void removeChild(int index);
void clearChildren();
std::vector<NodeTree<T>*> getChildren();
NodeTree<T>* get(int index);
std::string getName();
void setName(std::string);
T getData() const;
T* getDataRef();
void setData(T data);
int size();
std::string DOTGraphString();
private:
std::string DOTGraphStringHelper(std::vector<NodeTree<T>*> avoidList);
std::string getDOTName();
std::string name;
T data;
std::vector<NodeTree<T>*> parents;
std::vector<NodeTree<T>*> children;
static int idCounter;
int id;
};
template<class T>
int NodeTree<T>::idCounter;
template<class T>
NodeTree<T>::NodeTree() {
name = "UnnamedNode";
id = idCounter++;
}
template<class T>
NodeTree<T>::NodeTree(std::string name, T inData) {
this->name = name;
this->data = inData;
id = idCounter++;
}
template<class T>
NodeTree<T>::~NodeTree() {
children.clear();
parents.clear(); //? Will this segfault?
}
template<class T>
const bool NodeTree<T>::operator==(NodeTree &other) {
if (!(data == other.data))
return false;
if (children.size() != other.getChildren().size())
return false;
for (typename std::vector<NodeTree<T>*>::size_type i = 0; i < children.size(); i++)
if (! (*(children[i]) == *(other.getChildren()[i])))
return false;
return true;
}
//Used when making a map of NodeTrees
template<class T>
const bool NodeTree<T>::operator<(const NodeTree &other) const {
return data < other.getData();
}
template<class T>
void NodeTree<T>::setParent(NodeTree<T>* parent) {
parents.clear();
parents.push_back(parent);
}
template<class T>
void NodeTree<T>::addParent(NodeTree<T>* parent) {
parents.push_back(parent);
}
template<class T>
NodeTree<T>* NodeTree<T>::getParent() {
if (parents.size() > 0)
return parents[0];
return NULL;
}
template<class T>
std::vector<NodeTree<T>*> NodeTree<T>::getParents() {
return parents;
}
template<class T>
void NodeTree<T>::addChild(NodeTree<T>* child) {
if (!child)
throw "Help, NULL child";
//if (findChild(child) == -1)
children.push_back(child);
}
template<class T>
void NodeTree<T>::insertChild(int i, NodeTree<T>* child) {
if (!child)
throw "Help, NULL child";
//if (findChild(child) == -1)
children.insert(children.begin()+i,child);
}
template<class T>
void NodeTree<T>::addChildren(std::vector<NodeTree<T>*>* children) {
for (typename std::vector<NodeTree<T>*>::size_type i = 0; i < children->size(); i++)
addChild((*children)[i]);
}
template<class T>
void NodeTree<T>::addChildren(std::vector<NodeTree<T>*> children) {
for (typename std::vector<NodeTree<T>*>::size_type i = 0; i < children.size(); i++)
addChild(children[i]);
}
template<class T>
void NodeTree<T>::insertChildren(int index, std::vector<NodeTree<T>*>* children) {
for (typename std::vector<NodeTree<T>*>::size_type i = 0; i < children->size(); i++)
insertChild(index+i,(*children)[i]);
}
template<class T>
void NodeTree<T>::insertChildren(int index, std::vector<NodeTree<T>*> children) {
for (typename std::vector<NodeTree<T>*>::size_type i = 0; i < children.size(); i++)
insertChild(index+i, children[i]);
}
template<class T>
int NodeTree<T>::findChild(NodeTree<T>* child) {
for (int i = 0; i < children.size(); i++) {
if (children[i] == child) {
return i;
}
}
return -1;
}
template<class T>
void NodeTree<T>::removeChild(int index) {
children[index] = NULL;
children.erase(children.begin()+index);
}
template<class T>
void NodeTree<T>::removeChild(NodeTree<T>* child) {
int index = findChild(child);
if (index != -1) {
removeChild(index);
}
}
template<class T>
void NodeTree<T>::clearChildren() {
for (typename std::vector<T>::size_type i = 0; i < children.size(); i++)
children[i] = NULL;
children.clear();
}
template<class T>
std::vector<NodeTree<T>*> NodeTree<T>::getChildren() {
return children;
}
template<class T>
int NodeTree<T>::size() {
int count = 0;
for (int i = 0; i < children.size(); i++) {
count += children[i]->size();
}
return 1+count;
}
template<class T>
NodeTree<T>* NodeTree<T>::get(int index) {
return children[index];
}
template<class T>
std::string NodeTree<T>::getName() {
return name;
}
template<class T>
void NodeTree<T>::setName(std::string name) {
this->name = name;
}
template<class T>
T NodeTree<T>::getData() const {
return data;
}
template<class T>
T* NodeTree<T>::getDataRef() {
return &data;
}
template<class T>
void NodeTree<T>::setData(T data) {
this->data = data;
}
template<class T>
std::string NodeTree<T>::DOTGraphString() {
return( "digraph Kraken { \n" + DOTGraphStringHelper(std::vector<NodeTree<T>*>()) + "}");
}
template<class T>
std::string NodeTree<T>::DOTGraphStringHelper(std::vector<NodeTree<T>*> avoidList) {
for (typename std::vector<NodeTree<T>*>::size_type i = 0; i < avoidList.size(); i++)
if (this == avoidList[i])
return "";
avoidList.push_back(this);
std::string ourDOTRelation = "";
for (int i = 0; i < children.size(); i++) {
if (children[i] != NULL)
ourDOTRelation += getDOTName() + " -> " + children[i]->getDOTName() + ";\n" + children[i]->DOTGraphStringHelper(avoidList);
else
ourDOTRelation += getDOTName() + " -> BAD_NULL_" + getDOTName() + "\n";
}
return(ourDOTRelation);
}
template<class T>
std::string NodeTree<T>::getDOTName() {
std::string DOTName = "";
DOTName = "\"" + replaceExEscape(name + "-" + data.toString(), "\"", "\\\"") + "_" + intToString(id) + "\""; //Note that terminals already have a quote in the front of their name, so we don't need to add one
// if (data != NULL)
// DOTName = "\"" + replaceExEscape(name + "-" + data->toString(), "\"", "\\\"") + "_" + intToString(id) + "\""; //Note that terminals already have a quote in the front of their name, so we don't need to add one
// else
// DOTName = "\"" + replaceExEscape(name, "\"", " \\\"") + "_" + intToString(id) + "\"";
return(replaceExEscape(DOTName, "\n", "\\n"));
}
#endif

View File

@@ -1,36 +0,0 @@
#ifndef PARSE_ACTION_H
#define PARSE_ACTION_H
#ifndef NULL
#define NULL ((void*)0)
#endif
#include "util.h"
#include "ParseRule.h"
#include <vector>
#include <string>
class ParseAction {
public:
enum ActionType { INVALID, REDUCE, SHIFT, ACCEPT, REJECT };
ParseAction(ActionType action);
ParseAction(ActionType action, ParseRule* reduceRule);
ParseAction(ActionType action, int shiftState);
~ParseAction();
bool const equalsExceptLookahead(const ParseAction &other) const;
bool const operator==(const ParseAction &other) const;
bool const operator!=(const ParseAction &other) const;
bool const operator<(const ParseAction &other) const;
std::string toString(bool printRuleLookahead = true);
static std::string actionToString(ActionType action);
ActionType action;
ParseRule* reduceRule;
int shiftState;
};
#endif

View File

@@ -1,53 +0,0 @@
#ifndef PARSERULE_H
#define PARSERULE_H
#ifndef NULL
#define NULL ((void*)0)
#endif
#include "Symbol.h"
#include <vector>
#include <string>
#include <iostream>
class ParseRule {
private:
int pointerIndex;
Symbol leftHandle;
std::vector<Symbol> lookahead;
std::vector<Symbol> rightSide;
public:
ParseRule();
ParseRule(Symbol leftHandle, int pointerIndex, std::vector<Symbol> &rightSide, std::vector<Symbol> lookahead);
~ParseRule();
const bool equalsExceptLookahead(const ParseRule &other) const;
bool const operator==(const ParseRule &other) const;
bool const operator!=(const ParseRule &other) const;
bool const operator<(const ParseRule &other) const; //Used for ordering so we can put ParseRule's in sets, and also so that ParseActions will have an ordering
ParseRule* clone();
void setLeftHandle(Symbol leftHandle);
void appendToRight(Symbol appendee);
Symbol getLeftSide();
void setRightSide(std::vector<Symbol> rightSide);
std::vector<Symbol> getRightSide();
Symbol getAtNextIndex();
Symbol getAtIndex();
int getRightSize();
int getIndex();
bool advancePointer();
bool isAtEnd();
void setLookahead(std::vector<Symbol> lookahead);
void addLookahead(std::vector<Symbol> lookahead);
std::vector<Symbol> getLookahead();
std::string toString(bool printLookahead = true);
std::string toDOT();
};
#endif

View File

@@ -1,73 +0,0 @@
#ifndef PARSER_H
#define PARSER_H
#include "util.h"
#include "ParseRule.h"
#include "ParseAction.h"
#include "Symbol.h"
#include "State.h"
#include "StringReader.h"
#include "Lexer.h"
#include "NodeTree.h"
#include "Table.h"
#include <queue>
#include <set>
#include <map>
#include <vector>
#include <algorithm>
#include <stack>
#include <string>
#include <iostream>
class Parser {
public:
Parser();
~Parser();
virtual void loadGrammer(std::string grammerInputString);
virtual void createStateSet();
virtual std::string stateSetToString();
virtual NodeTree<Symbol>* parseInput(std::string inputString, std::string filename, bool highlight_errors) = 0; // filename for error reporting
virtual std::string grammerToString();
virtual std::string grammerToDOT();
std::string tableToString();
void exportTable(std::ofstream &file);
void importTable(char* tableData);
protected:
std::vector<Symbol> firstSet(Symbol token, std::vector<Symbol> avoidList = std::vector<Symbol>(), bool addNewTokens = true);
bool isNullable(Symbol token);
bool isNullableHelper(Symbol token, std::set<Symbol> done);
std::map<Symbol, std::vector<Symbol>> tokenFirstSet;
std::map<Symbol, bool> tokenNullable;
std::vector<Symbol> incrementiveFollowSet(ParseRule* rule);
virtual void closure(State* state);
virtual void addStates(std::vector< State* >* stateSets, State* state, std::queue<State*>* toDo);
int stateNum(State* state);
StringReader reader;
Lexer lexer;
std::map<std::pair<std::string, bool>, Symbol> symbols;
std::vector<ParseRule*> loadedGrammer;
std::vector< State* > stateSets;
Symbol EOFSymbol;
Symbol nullSymbol;
Symbol invalidSymbol;
Table table;
std::stack<int> stateStack;
std::stack<Symbol> symbolStack;
Symbol getOrAddSymbol(std::string symbolString, bool isTerminal);
};
#endif

View File

@@ -1,126 +0,0 @@
#ifndef POSET_H
#define POSET_H
#include <vector>
#include <set>
#include <map>
#include <queue>
#include <cassert>
#include "util.h"
template <class T>
class Poset {
public:
Poset();
~Poset();
void addRelationship(T first, T second);
void addVertex(T vertex);
bool zeroDependencies(T vertex);
std::set<T> getDependsOn(T dependency);
std::vector<T> getTopoSort();
static void test();
private:
//backing data structures
std::map<T, std::map<T,bool>> adjMatrix;
std::set<T> verticies;
};
template <class T>
Poset<T>::Poset() {
//Nothing needed
}
template <class T>
Poset<T>::~Poset() {
//Ditto
}
template <class T>
void Poset<T>::addRelationship(T first, T second) {
verticies.insert(first);
verticies.insert(second);
adjMatrix[first][second] = true;
}
template <class T>
void Poset<T>::addVertex(T vertex) {
verticies.insert(vertex);
}
template <class T>
bool Poset<T>::zeroDependencies(T vertex) {
auto depMapItr = adjMatrix.find(vertex);
if (depMapItr == adjMatrix.end())
return true;
for (auto i : depMapItr->second)
if (i.second == true)
return false;
return true;
}
template <class T>
std::set<T> Poset<T>::getDependsOn(T dependency) {
std::set<T> vertsThatDependOn;
for (auto i : adjMatrix) {
auto depItr = i.second.find(dependency);
if (depItr != i.second.end() && depItr->second)
vertsThatDependOn.insert(i.first);
}
return vertsThatDependOn;
}
template <class T>
std::vector<T> Poset<T>::getTopoSort() {
std::vector<T> sorted;
std::queue<T> toDo;
for (auto i : verticies)
if (zeroDependencies(i))
toDo.push(i);
while(!toDo.empty()) {
T current = toDo.front(); toDo.pop();
sorted.push_back(current);
for (T depOnCurrent : getDependsOn(current)) {
adjMatrix[depOnCurrent][current] = false; //Remove the edge to current, since current's now been taken care of
if (zeroDependencies(depOnCurrent))
toDo.push(depOnCurrent);
}
}
return sorted;
}
//would make it just an int specialization, but then we get multiple definition complaints...
template<class T>
void Poset<T>::test() {
std::string result;
{
Poset<int> poset;
poset.addVertex(1000);
for (int i = 0; i < 20; i++)
poset.addRelationship(i,i+1);
result = "";
for (int i : poset.getTopoSort())
result += intToString(i) + " ";
//std::cout << result << std::endl;
assert(result == "20 1000 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 "); //Note that sets do not have a set order, so this could change
//This is why the 1000 is in an odd, yet valid, position
}
{
Poset<int> poset;
for (int i = 0; i < 20; i+=2)
poset.addRelationship(i,i+1);
result = "";
for (int i : poset.getTopoSort())
result += intToString(i) + " ";
//std::cout << result << std::endl;
assert(result == "1 3 5 7 9 11 13 15 17 19 0 2 4 6 8 10 12 14 16 18 ");
}
std::cout << "Poset tests passed" << std::endl;
}
#endif

View File

@@ -1,68 +0,0 @@
#ifndef RNGLRPARSER_H
#define RNGLRPARSER_H
#include <iostream>
#include <queue>
#include <map>
#include <vector>
#include <set>
#include <utility>
#include <algorithm>
#include "Parser.h"
#include "Symbol.h"
#include "GraphStructuredStack.h"
#include "util.h"
class RNGLRParser: public Parser {
public:
RNGLRParser();
~RNGLRParser();
NodeTree<Symbol>* parseInput(std::string inputString, std::string filename, bool highlight_errors); // filename for error reporting
void printReconstructedFrontier(int frontier);
private:
void reducer(int i);
void shifter(int i);
void addChildren(NodeTree<Symbol>* parent, std::vector<NodeTree<Symbol>*>* children, NodeTree<Symbol>* nullableParts);
void addStates(std::vector< State* >* stateSets, State* state, std::queue<State*>* toDo);
void addStateReductionsToTable(State* state);
bool fullyReducesToNull(ParseRule* rule);
bool reducesToNull(ParseRule* rule);
bool reducesToNull(ParseRule* rule, std::vector<Symbol> avoidList);
bool belongsToFamily(NodeTree<Symbol>* node, std::vector<NodeTree<Symbol>*>* nodes);
bool arePacked(std::vector<NodeTree<Symbol>*> nodes);
bool isPacked(NodeTree<Symbol>* node);
void setPacked(NodeTree<Symbol>* node, bool isPacked);
NodeTree<Symbol>* getNullableParts(ParseRule* rule);
NodeTree<Symbol>* getNullableParts(ParseRule* rule, std::vector<NodeTree<Symbol>*> avoidList);
NodeTree<Symbol>* getNullableParts(Symbol symbol);
std::vector<NodeTree<Symbol>*> getPathEdges(std::vector<NodeTree<int>*> path);
int findLine(int tokenNum); //Get the line number for a token, used for error reporting
std::vector<Symbol> input;
GraphStructuredStack gss;
//start node, lefthand side of the reduction, reduction length
struct Reduction {
NodeTree<int>* from;
Symbol symbol;
int length;
NodeTree<Symbol>* nullableParts;
NodeTree<Symbol>* label;
} ;
std::queue<Reduction> toReduce;
//Node coming from, state going to
std::queue< std::pair<NodeTree<int>*, int> > toShift;
std::vector<std::pair<NodeTree<Symbol>*, int> > SPPFStepNodes;
std::vector<NodeTree<Symbol>*> nullableParts;
std::map<NodeTree<Symbol>, bool> packedMap;
std::map<ParseRule*, bool> reduceToNullMap;
};
#endif

View File

@@ -1,29 +0,0 @@
#ifndef REGEX_H
#define REGEX_H
#include "util.h"
#include "RegExState.h"
#include "Symbol.h"
#include <string>
#include <utility>
#include <stack>
#include <vector>
class RegEx {
public:
RegEx();
RegEx(std::string inPattern);
~RegEx();
RegExState* construct(std::vector<RegExState*>* ending, std::string pattern);
int longMatch(std::string stringToMatch);
std::string getPattern();
std::string toString();
static void test();
private:
std::string pattern;
RegExState* begin;
std::vector<RegExState*> currentStates;
};
#endif

View File

@@ -1,32 +0,0 @@
#ifndef REGEXSTATE_H
#define REGEXSTATE_H
#include "util.h"
#include "Symbol.h"
#include <string>
#include <vector>
class RegExState {
public:
RegExState(char inCharacter);
RegExState();
~RegExState();
void addNext(RegExState* nextState);
bool characterIs(char inCharacter);
std::vector<RegExState*> advance(char advanceCharacter);
std::vector<RegExState*> getNextStates();
bool isGoal();
std::string toString();
std::string toString(RegExState* avoid);
std::string toString(std::vector<RegExState*>* avoid);
char getCharacter();
private:
std::vector<RegExState*> nextStates;
char character;
};
#endif

View File

@@ -1,50 +0,0 @@
#ifndef REMOVALTRANSFORMATION_H
#define REMOVALTRANSFORMATION_H
#include <queue>
#include <vector>
#include "NodeTransformation.h"
template<class T>
class RemovalTransformation: public NodeTransformation<T,T> {
public:
RemovalTransformation(T toRemove);
~RemovalTransformation();
virtual NodeTree<T>* transform(NodeTree<T>* from);
private:
T toRemove;
};
#endif
template<class T>
RemovalTransformation<T>::RemovalTransformation(T toRemove) {
this->toRemove = toRemove;
}
template<class T>
RemovalTransformation<T>::~RemovalTransformation() {
//
}
template<class T>
NodeTree<T>* RemovalTransformation<T>::transform(NodeTree<T>* from) {
std::queue<NodeTree<T>*> toProcess;
toProcess.push(from);
while(!toProcess.empty()) {
NodeTree<T>* node = toProcess.front();
toProcess.pop();
if (!node)
continue;
std::vector<NodeTree<T>*> children = node->getChildren();
for (int i = 0; i < children.size(); i++) {
if (children[i]->getData() == toRemove)
node->removeChild(children[i]);
else if (children[i])
toProcess.push(children[i]);
}
}
return from;
}

View File

@@ -1,46 +0,0 @@
#ifndef STATE_H
#define STATE_H
#ifndef NULL
#define NULL ((void*)0)
#endif
#include "util.h"
#include "ParseRule.h"
#include <vector>
#include <string>
#include <sstream>
class State {
public:
State(int number, ParseRule* basis);
State(int number, ParseRule* basis, State* parent);
~State();
bool const operator==(const State &other);
bool const basisEquals(const State &other);
bool const basisEqualsExceptLookahead(const State &other);
bool const operator!=(const State &other);
std::vector<ParseRule*>* getBasis();
std::vector<ParseRule*>* getRemaining();
std::vector<ParseRule*> getTotal();
bool containsRule(ParseRule* rule);
void addRuleCombineLookahead(ParseRule* rule);
std::string toString();
void combineStates(State &other);
void addParents(std::vector<State*>* parents);
std::vector<State*>* getParents();
std::vector<State*>* getDeepParents(int depth);
int getNumber();
std::vector<ParseRule*> basis;
std::vector<ParseRule*> remaining;
private:
std::vector<State*> parents;
int number;
};
#endif

View File

@@ -1,28 +0,0 @@
#ifndef StringReader_H
#define StringReader_H
#include <vector>
#include <string>
#include <iostream>
class StringReader
{
public:
StringReader();
StringReader(std::string inputString);
virtual ~StringReader();
void setString(std::string inputString);
std::string word(bool truncateEnd = true);
std::string line(bool truncateEnd = true);
std::string getTokens(const char *get_chars, bool truncateEnd = true);
std::string truncateEnd(std::string to_truncate);
static void test();
protected:
private:
std::string rd_string;
int str_pos;
bool end_reached;
};
#endif

View File

@@ -1,37 +0,0 @@
#ifndef SYMBOL_H
#define SYMBOL_H
#ifndef NULL
#define NULL ((void*)0)
#endif
#include "NodeTree.h"
#include <vector>
#include <string>
class Symbol {
public:
Symbol();
Symbol(std::string name, bool isTerminal);
Symbol(std::string name, bool isTerminal, std::string value);
Symbol(std::string name, bool isTerminal, NodeTree<Symbol>* tree);
~Symbol();
bool const operator==(const Symbol &other)const;
bool const operator!=(const Symbol &other)const;
bool const operator<(const Symbol &other)const;
std::string getName() const;
std::string getValue() const;
std::string toString() const;
Symbol clone();
void setSubTree(NodeTree<Symbol>* tree);
NodeTree<Symbol>* getSubTree();
bool isTerminal();
private:
std::string name;
std::string value;
bool terminal;
};
#endif

View File

@@ -1,37 +0,0 @@
#include <fstream>
#include <string>
#include <utility>
#include "util.h"
#include "ParseRule.h"
#include "ParseAction.h"
#include "Symbol.h"
#include "State.h"
#ifndef TABLE_H
#define TABLE_H
class Table {
public:
Table();
~Table();
void exportTable(std::ofstream &file);
void importTable(char* tableData);
void setSymbols(Symbol EOFSymbol, Symbol nullSymbol);
void add(int stateNum, Symbol tranSymbol, ParseAction* action);
void remove(int stateNum, Symbol tranSymbol);
std::vector<ParseAction*>* get(int state, Symbol token);
ParseAction* getShift(int state, Symbol token);
std::vector<std::pair<std::string, ParseAction>> stateAsParseActionVector(int state);
std::string toString();
private:
std::vector< std::vector< std::vector<ParseAction*>* >* > table;
std::vector<Symbol> symbolIndexVec;
//The EOFSymbol, a pointer because of use in table, etc
Symbol EOFSymbol;
//The nullSymbol, ditto with above. Also used in comparisons
Symbol nullSymbol;
};
#endif

View File

@@ -1,32 +0,0 @@
#include <iostream>
#include <string>
#include <stdlib.h>
#include "util.h"
#ifndef TESTER_H
#define TESTER_H
class Tester {
public:
Tester(std::string krakenInvocation, std::string krakenGrammerLocation);
~Tester();
bool run(std::string fileName);
bool compareFiles(std::string file1Path, std::string file2Path);
void cleanExtras(std::string path);
private:
std::string krakenInvocation;
std::string krakenGrammerLocation;
std::string removeCmd;
std::string resultsExtention;
std::string expectedExtention;
std::string krakenExtention;
std::string shell;
std::string changePermissions;
std::string redirect;
std::string sep;
std::string cd;
};
#endif

View File

@@ -1,64 +0,0 @@
#ifndef TYPE_H
#define TYPE_H
#ifndef NULL
#define NULL ((void*)0)
#endif
#include <string>
#include <iostream>
#include <set>
//Circular dependency
class ASTData;
#include "ASTData.h"
#include "util.h"
enum ValueType {none, template_type, template_type_type, void_type, boolean, character, integer, floating, double_percision, function_type };
class Type {
public:
Type();
Type(ValueType typeIn, int indirectionIn = 0);
Type(ValueType typeIn, std::set<std::string> traitsIn); //Mostly for template type type's
Type(NodeTree<ASTData>* typeDefinitionIn, int indirectionIn = 0);
Type(NodeTree<ASTData>* typeDefinitionIn, std::set<std::string> traitsIn);
Type(ValueType typeIn, NodeTree<ASTData>* typeDefinitionIn, int indirectionIn, bool referenceIn, std::set<std::string> traitsIn);
Type(ValueType typeIn, NodeTree<ASTData>* typeDefinitionIn, int indirectionIn, bool referenceIn, std::set<std::string> traitsIn, std::vector<Type*> parameterTypesIn, Type* returnTypeIn);
Type(std::vector<Type*> parameterTypesIn, Type* returnTypeIn, bool referenceIn = false);
Type(ValueType typeIn, NodeTree<Symbol>* templateDefinitionIn, std::set<std::string> traitsIn = std::set<std::string>());
~Type();
const bool test_equality(const Type &other, bool care_about_references) const;
bool const operator==(const Type &other)const;
bool const operator!=(const Type &other)const;
bool const operator<(const Type &other)const;
Type* clone();
std::string toString(bool showTraits = true);
int getIndirection();
void setIndirection(int indirectionIn);
void increaseIndirection();
void decreaseIndirection();
void modifyIndirection(int mod);
Type withIncreasedIndirection();
Type withReference();
Type *withReferencePtr();
Type *withIncreasedIndirectionPtr();
Type withDecreasedIndirection();
Type* withoutReference();
ValueType baseType;
NodeTree<ASTData>* typeDefinition;
NodeTree<Symbol>* templateDefinition;
std::map<std::string, Type*> templateTypeReplacement;
bool templateInstantiated;
std::set<std::string> traits;
std::vector<Type*> parameterTypes;
Type *returnType;
bool is_reference;
private:
int indirection;
};
#endif

View File

@@ -1,92 +0,0 @@
#ifndef UTIL_H
#define UTIL_H
#ifndef NULL
#define NULL ((void*)0)
#endif
#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <set>
#include <fstream>
#include <cstring>
int ssystem(std::string command);
std::string intToString(int theInt);
std::string replaceExEscape(std::string first, std::string search, std::string replace);
std::string strSlice(std::string str, int begin, int end);
int findPerenEnd(std::string str, int i);
std::vector<std::string> split(const std::string &str, char delim);
std::string join(const std::vector<std::string> &strVec, std::string joinStr);
std::string readFile(std::istream &file);
std::string padWithSpaces(std::string str, int padTo);
template <typename T, typename U, typename V>
class triple {
public:
T first;
U second;
V third;
};
template <typename T, typename U, typename V>
triple<T,U,V> make_triple(T f, U s, V t) {
triple<T,U,V> out;
out.first = f;
out.second = s;
out.third = t;
return out;
}
template <typename T>
bool contains(std::vector<T> vec, T item) {
for (auto i : vec)
if (i == item)
return true;
return false;
}
template <typename T>
std::vector<T> flatten(std::vector<std::vector<T>> vec) {
std::vector<T> flat;
for (auto i : vec)
flat.insert(flat.end(), i.begin(), i.end());
return flat;
}
template <typename T>
std::vector<T> reverse(std::vector<T> vec) {
std::vector<T> flat;
flat.insert(flat.end(), vec.rbegin(), vec.rend());
return flat;
}
template <typename T>
std::vector<T> dereferenced(std::vector<T*> vec) {
std::vector<T> de;
for (T* i:vec)
de.push_back(*i);
return de;
}
template <typename T>
std::vector<T> slice(std::vector<T> vec, int begin, int end, int step = 1) {
std::vector<T> toReturn;
if (begin < 0)
begin += vec.size()+1;
if (end < 0)
end += vec.size()+1;
for (int i = begin; i < end; i += step)
toReturn.push_back(vec[i]);
return toReturn;
}
template <typename T>
bool subset(std::set<T> a, std::set<T> b) {
for (auto i : a)
if (b.find(i) == b.end())
return false;
return true;
}
#endif
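
Several of these helpers have Python-flavoured semantics that are easy to misread, in particular slice, where a negative index is offset by vec.size()+1 (this is how main.cpp strips a file extension). A small usage sketch using only functions declared above; the sample values are invented:

#include "util.h"
#include <iostream>

int main() {
    std::vector<std::string> parts = split("main.krak", '.');   // {"main", "krak"}
    // slice(parts, 0, -2): end becomes parts.size()-1, so the last element is dropped
    std::string base = join(slice(parts, 0, -2), ".");          // "main"
    auto t = make_triple(1, std::string("two"), 3.0);
    std::cout << base << " " << t.second << " "
              << (contains(parts, std::string("krak")) ? "found" : "missing") << std::endl;
    return 0;
}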

View File

@@ -1,190 +0,0 @@
#include <string>
#include <iostream>
#include <fstream>
#include <vector>
#include <cstring>
#include "NodeTree.h"
#include "Symbol.h"
#include "Lexer.h"
#include "RNGLRParser.h"
#include "Importer.h"
#include "ASTData.h"
#include "CGenerator.h"
#include "Poset.h"
#include "util.h"
#include "Tester.h"
int main(int argc, char* argv[]) {
std::vector<std::string> includePaths;
includePaths.push_back(""); //Local
if (argc <= 1) {
std::cerr << "Kraken invocation: kraken sourceFile.krak" << std::endl;
std::cerr << "Kraken invocation: kraken sourceFile.krak outputName" << std::endl;
std::cerr << "Kraken invocation: kraken grammerFile.kgm sourceFile.krak outputName" << std::endl;
std::cerr << "Or for testing do: kraken --test [optional list of names of file (.krak .expected_results) without extentions to run]" << std::endl;
return 0;
}
std::string grammerFileString = "../krakenGrammer.kgm";
if (argc >= 2 && std::string(argv[1]) == "--test") {
StringReader::test();
RegEx::test();
Lexer::test();
//std::cout << strSlice("123", 0, -1) << std::endl;
Poset<int>::test();
if (argc >= 3) {
std::string testResults, line;
int passed = 0, failed = 0;
Tester test(argv[0], grammerFileString);
// find the max length so we can pad the string and align the results
unsigned int maxLineLength = 0;
for (int i = 2; i < argc; i++) {
int strLen = std::string(argv[i]).length();
maxLineLength = maxLineLength < strLen ? strLen : maxLineLength;
}
for (int i = 2; i < argc; i++) {
bool result = test.run(argv[i]);
if (result)
line = padWithSpaces(std::string(argv[i]), maxLineLength) + "\t\tpassed!\n", passed++;
else
line = padWithSpaces(std::string(argv[i]), maxLineLength) + "\t\tFAILED!!!!\n", failed++;
std::cout << line << std::endl;
testResults += line;
}
std::cout << "===========Done Testing===========" << std::endl;
std::cout << testResults << std::endl;
std::cout << "Test results: " << passed << "/" << passed+failed << std::endl;
}
return 0;
}
std::string krakenDir = argv[0];
krakenDir = strSlice(krakenDir, 0, -(std::string("kraken").length()+1));
includePaths.push_back(krakenDir + "stdlib/"); //Add the stdlib directory that exists in the same directory as the kraken executable to the path
std::string programName;
std::string outputName;
bool parse_only = false;
//std::cout << "argv[1] == " << argv[1] << std::endl;
if (std::string(argv[1]) == "--parse-only") {
parse_only = true;
grammerFileString = argv[2];
programName = argv[3];
//outputName = argv[3];
} else if (argc > 3) {
grammerFileString = argv[1];
programName = argv[2];
outputName = argv[3];
} else if (argc == 3) {
programName = argv[1];
outputName = argv[2];
} else {
programName = argv[1];
outputName = join(slice(split(programName, '.'), 0, -2), "."); // without extension
}
std::ifstream grammerInFile, compiledGrammerInFile;
std::ofstream compiledGrammerOutFile;
grammerInFile.open(grammerFileString);
if (!grammerInFile.is_open()) {
std::cerr << "Problem opening grammerInFile " << grammerFileString << "\n";
return(1);
}
compiledGrammerInFile.open(grammerFileString + ".comp", std::ios::binary | std::ios::ate);
if (!compiledGrammerInFile.is_open())
std::cerr << "Problem opening compiledGrammerInFile " << grammerFileString + ".comp" << "\n";
//Read the input file into a string
std::string grammerInputFileString;
std::string line;
while(grammerInFile.good()) {
getline(grammerInFile, line);
grammerInputFileString.append(line+"\n");
}
grammerInFile.close();
RNGLRParser parser;
parser.loadGrammer(grammerInputFileString);
//Start binary stuff
bool compGramGood = false;
if (compiledGrammerInFile.is_open()) {
//std::cout << "Compiled grammer file exists, reading it in" << std::endl;
std::streampos compGramSize = compiledGrammerInFile.tellg();
char* binaryTablePointer = new char [compGramSize];
compiledGrammerInFile.seekg(0, std::ios::beg);
compiledGrammerInFile.read(binaryTablePointer, compGramSize);
compiledGrammerInFile.close();
//Check magic number
if (binaryTablePointer[0] == 'K' && binaryTablePointer[1] == 'R' && binaryTablePointer[2] == 'A' && binaryTablePointer[3] == 'K') {
//std::cout << "Valid Kraken Compiled Grammer File" << std::endl;
int gramStringLength = *((int*)(binaryTablePointer+4));
//std::cout << "The grammer string is stored to be " << gramStringLength << " characters long, gramString is "
//<< grammerInputFileString.length() << " long. Remember 1 extra for null terminator!" << std::endl;
if (grammerInputFileString.length() != gramStringLength-1 ||
(strncmp(grammerInputFileString.c_str(), (binaryTablePointer+4+sizeof(int)), gramStringLength) != 0)) {
//(one less for null terminator that is stored)
std::cout << "The Grammer has been changed, will re-create" << std::endl;
} else {
compGramGood = true;
//std::cout << "Grammer file is up to date." << std::endl;
parser.importTable(binaryTablePointer + 4 + sizeof(int) + gramStringLength); //Load table starting at the table section
}
} else {
std::cerr << grammerFileString << ".comp is NOT A Valid Kraken Compiled Grammer File, aborting" << std::endl;
return -1;
}
delete [] binaryTablePointer;
}
if (!compGramGood) {
//The load failed because either the file does not exist or it is not up-to-date.
std::cout << "Compiled grammer file does not exist or is not up-to-date, generating table and writing it out" << std::endl;
compiledGrammerOutFile.open(grammerFileString + ".comp", std::ios::binary);
if (!compiledGrammerOutFile.is_open())
std::cerr << "Could not open compiled file to write either!" << std::endl;
compiledGrammerOutFile.write("KRAK", sizeof(char)*4); //Let us know when we load it that this is a kraken grammer file, but don't write out
compiledGrammerOutFile.flush(); // the grammer txt until we create the set, so that if we fail creating it it won't look valid
parser.createStateSet();
int* intBuffer = new int;
*intBuffer = grammerInputFileString.length()+1;
compiledGrammerOutFile.write((char*)intBuffer, sizeof(int));
delete intBuffer;
compiledGrammerOutFile.write(grammerInputFileString.c_str(), grammerInputFileString.length()+1); //Don't forget null terminator
parser.exportTable(compiledGrammerOutFile);
compiledGrammerOutFile.close();
}
//End binary stuff
//std::cout << "\nParsing" << std::endl;
//std::cout << "\toutput name: " << outputName << std::endl;
//std::cout << "\tprogram name: " << programName << std::endl;
Importer importer(&parser, includePaths, outputName, parse_only); // Output name for directory to put stuff in
//for (auto i : includePaths)
//std::cout << i << std::endl;
importer.import(programName);
std::map<std::string, NodeTree<ASTData>*> ASTs = importer.getASTMap();
if (parse_only)
return 0;
//Do optimization, etc. here.
//None at this time, instead going straight to C in this first (more naive) version
//Code generation
//For right now, just C
// return code from calling C compiler
return CGenerator().generateCompSet(ASTs, outputName);
}

View File

@@ -1,88 +0,0 @@
#include "ASTData.h"
ASTData::ASTData() {
this->type = undef;
this->valueType = NULL;
}
ASTData::ASTData(ASTType type, Type *valueType) {
this->type = type;
this->valueType = valueType;
}
ASTData::ASTData(ASTType type, Symbol symbol, Type *valueType) {
this->type = type;
this->valueType = valueType;
this->symbol = symbol;
}
ASTData::~ASTData() {
}
std::string ASTData::toString() {
return ASTTypeToString(type) + " " +
(symbol.isTerminal() ? " " + symbol.toString() : "") + " " +
(valueType ? valueType->toString() : "no_type");
}
std::string ASTData::ASTTypeToString(ASTType type) {
switch (type) {
case translation_unit:
return "translation_unit";
case identifier:
return "identifier";
case import:
return "import";
case function:
return "function";
case type_def:
return "type_def";
case code_block:
return "code_block";
case typed_parameter:
return "typed_parameter";
case expression:
return "expression";
case boolean_expression:
return "boolean_expression";
case statement:
return "statement";
case if_statement:
return "if_statement";
case while_loop:
return "while_loop";
case for_loop:
return "for_loop";
case return_statement:
return "return_statement";
case break_statement:
return "break_statement";
case continue_statement:
return "continue_statement";
case defer_statement:
return "defer_statement";
case assignment_statement:
return "assignment_statement";
case declaration_statement:
return "declaration_statement";
case if_comp:
return "if_comp";
case simple_passthrough:
return "simple_passthrough";
case passthrough_params:
return "passthrough_params";
case in_passthrough_params:
return "out_passthrough_params";
case param_assign:
return "param_assign";
case opt_string:
return "opt_string";
case function_call:
return "function_call";
case value:
return "value";
default:
return "unknown_ASTType";
}
}

File diff suppressed because it is too large

View File

@@ -1,45 +0,0 @@
#include "CCodeTriple.h"
CCodeTriple::CCodeTriple(std::string pre, std::string val, std::string post) {
preValue = pre;
value = val;
postValue = post;
}
CCodeTriple::CCodeTriple(std::string val) {
value = val;
}
CCodeTriple::CCodeTriple(const char* val) {
value = val;
}
CCodeTriple::CCodeTriple() {
}
CCodeTriple::~CCodeTriple() {
}
std::string CCodeTriple::oneString(bool endValue) {
return preValue + value + (endValue ? ";" : "") + postValue;
}
CCodeTriple & CCodeTriple::operator=(const CCodeTriple &rhs) {
preValue = rhs.preValue;
value = rhs.value;
postValue = rhs.postValue;
return *this;
}
CCodeTriple & CCodeTriple::operator+=(const CCodeTriple &rhs) {
preValue += rhs.preValue;
//preValue = rhs.preValue + preValue;
value += rhs.value;
postValue = rhs.postValue + postValue;
return *this;
}
CCodeTriple operator+(const CCodeTriple &a, const CCodeTriple &b) {
return CCodeTriple(a.preValue + b.preValue, a.value + b.value, b.postValue + a.postValue);
//return CCodeTriple(b.preValue + a.preValue, a.value + b.value, b.postValue + a.postValue);
}
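
The pre/value/post split is the key design choice in this file: code that must run before an expression (temporaries) goes in preValue, the expression itself in value, and cleanup in postValue, with operator+ prepending the right operand's postValue so cleanups run in reverse order of creation. A hedged sketch, assuming CCodeTriple.h declares the functions defined above; the generated C strings are invented:

#include "CCodeTriple.h"
#include <iostream>

int main() {
    CCodeTriple a("char* s1 = make();\n", "s1", "free(s1);\n");
    CCodeTriple b("char* s2 = make();\n", "s2", "free(s2);\n");
    CCodeTriple call = CCodeTriple("concat(") + a + ", " + b + ")";
    // preValue holds both allocations in order, value is "concat(s1, s2)",
    // and postValue frees s2 before s1 because operator+ prepends the rhs postValue.
    std::cout << call.oneString(true) << std::endl;
    return 0;
}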

File diff suppressed because it is too large

View File

@@ -1,145 +0,0 @@
#include "GraphStructuredStack.h"
GraphStructuredStack::GraphStructuredStack() {
//
}
GraphStructuredStack::~GraphStructuredStack() {
//
}
NodeTree<int>* GraphStructuredStack::newNode(int stateNum) {
return new NodeTree<int>("gssNode", stateNum);
}
void GraphStructuredStack::addToFrontier(int frontier, NodeTree<int>* node) {
//First, make sure our vector has this frontier and all lesser ones. If not, add frontiers up to and including it
while (gss.size() <= frontier) {
gss.push_back(new std::vector<NodeTree<int>*>());
}
gss[frontier]->push_back(node);
containing_frontier_map[node] = frontier;
}
NodeTree<int>* GraphStructuredStack::inFrontier(int frontier, int state) {
if (frontierIsEmpty(frontier))
return NULL;
for (std::vector<NodeTree<int>*>::size_type i = 0; i < gss[frontier]->size(); i++) {
if ((*(gss[frontier]))[i]->getData() == state)
return (*(gss[frontier]))[i];
}
return NULL;
}
int GraphStructuredStack::getContainingFrontier(NodeTree<int>* node) {
auto iter = containing_frontier_map.find(node);
if (iter != containing_frontier_map.end())
return iter->second;
return -1;
//for (std::vector<std::vector<NodeTree<int>*>*>::size_type i = 0; i < gss.size(); i++) {
//if (frontierIsEmpty(i))
//continue;
//for (std::vector<NodeTree<int>*>::size_type j = 0; j < gss[i]->size(); j++) {
//if ((*(gss[i]))[j] == node)
//return i;
//}
//}
//return -1;
}
bool GraphStructuredStack::frontierIsEmpty(int frontier) {
return frontier >= gss.size() || gss[frontier]->size() == 0;
}
NodeTree<int>* GraphStructuredStack::frontierGetAccState(int frontier) {
//The acc state is always state 1, for now
return inFrontier(frontier, 1);
}
std::vector<NodeTree<int>*>* GraphStructuredStack::getReachable(NodeTree<int>* start, int length) {
std::vector<NodeTree<int>*>* reachableList = new std::vector<NodeTree<int>*>();
std::queue<NodeTree<int>*> currentNodes;
std::queue<NodeTree<int>*> nextNodes;
currentNodes.push(start);
for (int i = 0; i < length; i++) {
while (!currentNodes.empty()) {
NodeTree<int>* currentNode = currentNodes.front();
currentNodes.pop();
std::vector<NodeTree<int>*> children = currentNode->getChildren();
//std::cout << currentNode->getData() << " has children ";
for (std::vector<NodeTree<int>*>::size_type j = 0; j < children.size(); j++) {
std::cout << children[j]->getData() << " ";
nextNodes.push(children[j]);
}
std::cout << std::endl;
}
currentNodes = nextNodes;
//No clear function, so go through and remove
while(!nextNodes.empty())
nextNodes.pop();
}
while (!currentNodes.empty()) {
reachableList->push_back(currentNodes.front());
//std::cout << currentNodes.front()->getData() << " is reachable from " << start->getData() << " by length " << length << std::endl;
currentNodes.pop();
}
return reachableList;
}
std::vector<std::vector<NodeTree<int>*> >* GraphStructuredStack::getReachablePaths(NodeTree<int>* start, int length) {
std::vector<std::vector<NodeTree<int>*> >* paths = new std::vector<std::vector<NodeTree<int>*> >();
std::vector<NodeTree<int>*> currentPath;
recursivePathFind(start, length, currentPath, paths);
return paths;
}
void GraphStructuredStack::recursivePathFind(NodeTree<int>* start, int length, std::vector<NodeTree<int>*> currentPath, std::vector<std::vector<NodeTree<int>*> >* paths) {
currentPath.push_back(start);
if (length == 0) {
paths->push_back(currentPath);
return;
}
std::vector<NodeTree<int>*> children = start->getChildren();
for (std::vector<NodeTree<int>*>::size_type i = 0; i < children.size(); i++) {
recursivePathFind(children[i], length-1, currentPath, paths);
}
}
bool GraphStructuredStack::hasEdge(NodeTree<int>* start, NodeTree<int>* end) {
//Really, either testing for parent or child should work.
return start->findChild(end) != -1;
}
NodeTree<Symbol>* GraphStructuredStack::getEdge(NodeTree<int>* start, NodeTree<int>* end) {
return edges[std::make_pair(start, end)];
}
void GraphStructuredStack::addEdge(NodeTree<int>* start, NodeTree<int>* end, NodeTree<Symbol>* edge) {
start->addChild(end);
end->addParent(start);
edges[std::make_pair(start, end)] = edge;
}
std::vector<int> GraphStructuredStack::getFrontier(int frontier) {
std::vector<int> toReturn;
for (int i = 0; i < gss[frontier]->size(); i++)
toReturn.push_back((*(gss[frontier]))[i]->getData());
return toReturn;
}
std::string GraphStructuredStack::toString() {
std::string tostring = "";
for (std::vector<std::vector<NodeTree<int>*>*>::size_type i = 0; i < gss.size(); i++) {
tostring += "Frontier: " + intToString(i) + "\n";
for (std::vector<NodeTree<int>*>::size_type j = 0; j < gss[i]->size(); j++) {
tostring += "|" + intToString((*(gss[i]))[j]->getData()) + "| ";
}
tostring += "\n";
}
return tostring;
}
void GraphStructuredStack::clear() {
gss.clear();
edges.clear();
}
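
For readers new to (RN)GLR parsing, the graph-structured stack above is the shared parse stack: each frontier holds the parser states alive after one input token, and labelled edges point back toward earlier frontiers. A hedged sketch of the API as RNGLRParser uses it further down; the state numbers and the Symbol label are made up:

#include "GraphStructuredStack.h"
#include <iostream>

int main() {
    GraphStructuredStack gss;
    NodeTree<int>* v0 = gss.newNode(0);       // state 0 lives in frontier 0
    gss.addToFrontier(0, v0);
    NodeTree<int>* v1 = gss.newNode(3);       // state 3 lives in frontier 1
    gss.addToFrontier(1, v1);
    NodeTree<Symbol>* label = new NodeTree<Symbol>("tok", Symbol("id", true, "x"));
    gss.addEdge(v1, v0, label);               // edges point back toward older frontiers
    std::cout << gss.toString();
    // every length-1 path from v1 ends at v0
    std::cout << gss.getReachablePaths(v1, 1)->size() << " path(s)" << std::endl;
    return 0;
}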

View File

@@ -1,238 +0,0 @@
#include "Importer.h"
#ifdef _WIN32
#include <unistd.h>
#define mkdir( A, B ) mkdir(A)
#endif
Importer::Importer(Parser* parserIn, std::vector<std::string> includePaths, std::string outputNameIn, bool only_parseIn) {
only_parse = only_parseIn;
//constructor
outputName = outputNameIn;
if (!only_parse) {
if (mkdir(("./" + outputName).c_str(), 0755)) {
//std::cerr << "\n\n =====IMPORTER===== \n\n" << std::endl;
//std::cerr << "Could not make directory " << outputName << std::endl;
}
}
parser = parserIn;
this->includePaths = includePaths;
ASTTransformer = new ASTTransformation(this);
removeSymbols.push_back(Symbol("$NULL$", true));
removeSymbols.push_back(Symbol("WS", false));
removeSymbols.push_back(Symbol("\\(", true));
removeSymbols.push_back(Symbol("\\)", true));
removeSymbols.push_back(Symbol("var", true));
removeSymbols.push_back(Symbol("fun", true));
removeSymbols.push_back(Symbol(";", true));
removeSymbols.push_back(Symbol("line_end", false));
removeSymbols.push_back(Symbol("{", true));
removeSymbols.push_back(Symbol("}", true));
removeSymbols.push_back(Symbol("(", true));
removeSymbols.push_back(Symbol(")", true));
//removeSymbols.push_back(Symbol("import", true));
removeSymbols.push_back(Symbol("if", true));
removeSymbols.push_back(Symbol("while", true));
removeSymbols.push_back(Symbol("__if_comp__", true));
//removeSymbols.push_back(Symbol("simple_passthrough", true));
removeSymbols.push_back(Symbol("comp_simple_passthrough", true));
removeSymbols.push_back(Symbol("def_nonterm", false));
removeSymbols.push_back(Symbol("obj_nonterm", false));
removeSymbols.push_back(Symbol("adt_nonterm", false));
removeSymbols.push_back(Symbol("template", true));
removeSymbols.push_back(Symbol("\\|", true));
//removeSymbols.push_back(Symbol("match", true));
collapseSymbols.push_back(Symbol("case_statement_list", false));
collapseSymbols.push_back(Symbol("opt_param_assign_list", false));
collapseSymbols.push_back(Symbol("param_assign_list", false));
collapseSymbols.push_back(Symbol("opt_typed_parameter_list", false));
collapseSymbols.push_back(Symbol("opt_parameter_list", false));
collapseSymbols.push_back(Symbol("identifier_list", false));
collapseSymbols.push_back(Symbol("adt_option_list", false));
collapseSymbols.push_back(Symbol("statement_list", false));
collapseSymbols.push_back(Symbol("parameter_list", false));
collapseSymbols.push_back(Symbol("typed_parameter_list", false));
collapseSymbols.push_back(Symbol("unorderd_list_part", false));
collapseSymbols.push_back(Symbol("if_comp_pred", false));
collapseSymbols.push_back(Symbol("declaration_block", false));
collapseSymbols.push_back(Symbol("type_list", false));
collapseSymbols.push_back(Symbol("opt_type_list", false));
collapseSymbols.push_back(Symbol("template_param_list", false));
collapseSymbols.push_back(Symbol("trait_list", false));
collapseSymbols.push_back(Symbol("dec_type", false));
//collapseSymbols.push_back(Symbol("pre_reffed", false));
}
Importer::~Importer() {
//destructor
delete ASTTransformer;
}
void Importer::registerAST(std::string name, NodeTree<ASTData>* ast, NodeTree<Symbol>* syntaxTree) {
imported[name] = ast;
importedTrips.push_back({name, ast, syntaxTree});
std::cout << "REGISTERD " << name << std::endl;
}
NodeTree<ASTData>* Importer::getUnit(std::string fileName) {
//std::cout << "\n\nImporting " << fileName << " ";
//Check to see if we've already done it
if (imported.find(fileName) != imported.end()) {
//std::cout << "Already Imported!" << std::endl;
return imported[fileName];
}
//std::cout << "Not yet imported" << std::endl;
return NULL;
}
NodeTree<ASTData>* Importer::importFirstPass(std::string fileName) {
NodeTree<ASTData>* ast = getUnit(fileName);
if (ast == NULL) {
NodeTree<Symbol>* parseTree = parseAndTrim(fileName);
if (!parseTree)
return NULL;
//Call with ourself to allow the transformation to call us to import files that it needs
if (!only_parse)
ast = ASTTransformer->firstPass(fileName, parseTree); //This firstPass will register itself
}
return ast;
}
void Importer::import(std::string fileName) {
//Start the ball rolling by importing and running the first pass on the first file.
//This will import, first pass and register all the other files too.
//std::cout << "\n\n =====FIRST PASS===== \n\n" << std::endl;
importFirstPass(fileName); //First pass defines all objects
if (only_parse)
return;
std::cout << "\n\n =====SECOND PASS===== \n\n" << std::endl;
for (importTriplet i : importedTrips) //Second pass defines data inside objects, outside declaration statements,
std::cout << "\n\nSecond pass for: " << i.name << std::endl, ASTTransformer->secondPass(i.ast, i.syntaxTree); //function prototypes, and identifiers (as we now have all type defs)
std::cout << "\n\n =====THIRD PASS===== \n\n" << std::endl;
for (importTriplet i : importedTrips) //Third pass does all function bodies
std::cout << "\n\nThird pass for: " << i.name << std::endl, ASTTransformer->thirdPass(i.ast, i.syntaxTree);
std::cout << "\n\n =====FOURTH PASS===== \n\n" << std::endl;
bool changed = true;
while (changed) {
changed = false;
for (importTriplet i : importedTrips) { //Fourth pass finishes up by doing all template classes
std::cout << "\n\nFourth pass for: " << i.name << std::endl;
changed = changed ? changed : ASTTransformer->fourthPass(i.ast, i.syntaxTree);
}
}
//Note that class template instantiation can happen in the second or third passes and that function template instantion
//can happen in the third pass.
std::ofstream outFileAST;
for (importTriplet i : importedTrips) {
std::string outputFileName = outputName + "/" + i.name + "out";
outFileAST.open((outputFileName + ".AST.dot").c_str());
if (!outFileAST.is_open()) {
std::cout << "Problem opening second output file " << outputFileName + ".AST.dot" << "\n";
return;
}
if (i.ast) {
//outFileAST << i.ast->DOTGraphString() << std::endl;
} else {
std::cout << "Tree returned from ASTTransformation for " << fileName << " is NULL!" << std::endl;
}
outFileAST.close();
}
}
NodeTree<Symbol>* Importer::parseAndTrim(std::string fileName) {
std::ifstream programInFile;
//std::ofstream outFile, outFileTransformed;
//std::cout << "outputName " << outputName << std::endl;
//std::cout << "fileName " << fileName << std::endl;
auto pathPieces = split(fileName, '/');
std::string outputFileName = outputName + "/" + pathPieces[pathPieces.size()-1] + "out";
//std::cout << "outputFileName " << outputFileName << std::endl;
std::string inputFileName;
for (auto i : includePaths) {
programInFile.open(i+fileName);
if (programInFile.is_open()) {
inputFileName = i+fileName;
break;
} else {
std::cout << i+fileName << " is no good" << std::endl;
}
}
if (!programInFile.is_open()) {
std::cout << "Problem opening programInFile " << fileName << "\n";
return NULL;
}
//outFile.open(outputFileName);
//if (!outFile.is_open()) {
//std::cout << "Probelm opening output file " << outputFileName << "\n";
//return NULL;
//}
//outFileTransformed.open((outputFileName + ".transformed.dot").c_str());
//if (!outFileTransformed.is_open()) {
//std::cout << "Probelm opening second output file " << outputFileName + ".transformed.dot" << "\n";
//return NULL;
//}
std::string programInputFileString, line;
while(programInFile.good()) {
getline(programInFile, line);
programInputFileString.append(line+"\n");
}
programInFile.close();
//std::cout << programInputFileString << std::endl;
NodeTree<Symbol>* parseTree = parser->parseInput(programInputFileString, inputFileName, !only_parse);
if (parseTree) {
//std::cout << parseTree->DOTGraphString() << std::endl;
//outFile << parseTree->DOTGraphString() << std::endl;
} else {
std::cout << "ParseTree returned from parser for " << fileName << " is NULL!" << std::endl;
//outFile.close(); outFileTransformed.close();
throw "unexceptablblllll";
return NULL;
}
if (only_parse)
return parseTree;
//outFile.close();
//Remove Transformations
for (int i = 0; i < removeSymbols.size(); i++)
parseTree = RemovalTransformation<Symbol>(removeSymbols[i]).transform(parseTree);
//Collapse Transformations
for (int i = 0; i < collapseSymbols.size(); i++)
parseTree = CollapseTransformation<Symbol>(collapseSymbols[i]).transform(parseTree);
if (parseTree) {
//outFileTransformed << parseTree->DOTGraphString() << std::endl;
} else {
std::cout << "Tree returned from transformation is NULL!" << std::endl;
}
//outFileTransformed.close();
std::cout << "Returning parse tree" << std::endl;
return parseTree;
}
std::map<std::string, NodeTree<ASTData>*> Importer::getASTMap() {
return imported;
}

View File

@@ -1,120 +0,0 @@
#include "Lexer.h"
#include <cassert>
Lexer::Lexer() {
//Do nothing
currentPosition = 0;
}
Lexer::Lexer(std::string inputString) {
input = inputString;
currentPosition = 0;
}
Lexer::~Lexer() {
//No cleanup necessary
}
void Lexer::setInput(std::string inputString) {
input = inputString;
}
void Lexer::addRegEx(std::string regExString) {
regExs.push_back(new RegEx(regExString));
}
Symbol Lexer::next() {
//std::cout << "Current at is \"" << input.substr(currentPosition) << "\" currentPos is " << currentPosition << " out of " << input.length() <<std::endl;
//If we're at the end, return an eof
if (currentPosition >= input.length())
return Symbol("$EOF$", true);
int longestMatch = -1;
RegEx* longestRegEx = NULL;
std::string remainingString = input.substr(currentPosition);
for (std::vector<RegEx*>::size_type i = 0; i < regExs.size(); i++) {
//std::cout << "Trying regex " << regExs[i]->getPattern() << std::endl;
int currentMatch = regExs[i]->longMatch(remainingString);
if (currentMatch > longestMatch) {
longestMatch = currentMatch;
longestRegEx = regExs[i];
}
}
if (longestRegEx != NULL) {
std::string eatenString = input.substr(currentPosition, longestMatch);
currentPosition += longestMatch;
//std::cout << "Current at is \"" << input.substr(currentPosition) << "\" currentPos is " << currentPosition <<std::endl;
return Symbol(longestRegEx->getPattern(), true, eatenString);
} else {
// std::cout << "Found no applicable regex" << std::endl;
// std::cout << "Remaining is ||" << input.substr(currentPosition) << "||" << std::endl;
return Symbol("$INVALID$", true);
}
}
void Lexer::test() {
Symbol s;
{
Lexer lex;
lex.addRegEx("b");
lex.setInput("bb");
s = lex.next();
assert(s.getName() == "b" && s.getValue() == "b");
s = lex.next();
assert(s.getName() == "b" && s.getValue() == "b");
assert(lex.next() == Symbol("$EOF$", true));
}
{
Lexer lex;
lex.addRegEx("a*");
lex.addRegEx("b");
lex.setInput("aaabaabb");
s = lex.next();
assert(s.getName() == "a*" && s.getValue() == "aaa");
s = lex.next();
assert(s.getName() == "b" && s.getValue() == "b");
s = lex.next();
assert(s.getName() == "a*" && s.getValue() == "aa");
s = lex.next();
assert(s.getName() == "b" && s.getValue() == "b");
s = lex.next();
assert(s.getName() == "b" && s.getValue() == "b");
assert(lex.next() == Symbol("$EOF$", true));
}
// Test a lexer error condition.
{
Lexer lex;
lex.addRegEx("a|b");
lex.setInput("blah");
s = lex.next();
assert(s.getName() == "a|b" && s.getValue() == "b");
assert(lex.next() == Symbol("$INVALID$", true));
}
// Lexer can consume all the input at once.
{
Lexer lex;
lex.addRegEx("xyzzy");
lex.setInput("xyzzy");
s = lex.next();
assert(s.getName() == "xyzzy" && s.getValue() == "xyzzy");
assert(lex.next() == Symbol("$EOF$", true));
}
// Lexer produces the longest match, not the first.
{
Lexer lex;
lex.addRegEx("int");
lex.addRegEx("(i|n|t|e)+");
lex.setInput("intent");
s = lex.next();
assert(s.getName() == "(i|n|t|e)+" && s.getValue() == "intent");
}
std::cout << "Lexer tests passed\n";
}
void Lexer::reset() {
currentPosition = 0;
}

View File

@@ -1,79 +0,0 @@
#include "ParseAction.h"
ParseAction::ParseAction(ActionType action) {
this->action = action;
this->reduceRule = NULL;
this->shiftState = -1;
}
ParseAction::ParseAction(ActionType action, ParseRule* reduceRule) {
this->action = action;
this->reduceRule = reduceRule;
this->shiftState = -1;
}
ParseAction::ParseAction(ActionType action, int shiftState) {
this->action = action;
this->reduceRule = NULL;
this->shiftState = shiftState;
}
ParseAction::~ParseAction() {
}
const bool ParseAction::equalsExceptLookahead(const ParseAction &other) const {
return( action == other.action && ( reduceRule == other.reduceRule || reduceRule->equalsExceptLookahead(*(other.reduceRule)) ) && shiftState == other.shiftState);
}
const bool ParseAction::operator==(const ParseAction &other) const {
return( action == other.action && ( reduceRule == other.reduceRule || *reduceRule == *(other.reduceRule) ) && shiftState == other.shiftState);
}
const bool ParseAction::operator!=(const ParseAction &other) const {
return !(this->operator==(other));
}
//Exists so we can put ParseActions into sets
const bool ParseAction::operator<(const ParseAction &other) const {
if (action != other.action)
return action < other.action;
if (reduceRule != other.reduceRule) {
if (! (reduceRule && other.reduceRule)) {
return reduceRule < other.reduceRule;
} else {
return *reduceRule < *(other.reduceRule);
}
}
return shiftState < other.shiftState;
}
std::string ParseAction::actionToString(ActionType action) {
switch (action) {
case REDUCE:
return "reduce";
break;
case SHIFT:
return "shift";
break;
case ACCEPT:
return "accept";
break;
case REJECT:
return "reject";
break;
default:
return "INVALID PARSE ACTION";
}
}
std::string ParseAction::toString(bool printRuleLookahead) {
std::string outputString = "";
outputString += actionToString(action);
if (reduceRule != NULL)
outputString += " " + reduceRule->toString(printRuleLookahead);
if (shiftState != -1)
outputString += " " + intToString(shiftState);
return(outputString);
}

View File

@@ -1,145 +0,0 @@
#include "ParseRule.h"
ParseRule::ParseRule() {
pointerIndex = 0;
}
ParseRule::ParseRule(Symbol leftHandle, int pointerIndex, std::vector<Symbol> &rightSide, std::vector<Symbol> lookahead) {
this->leftHandle = leftHandle;
this->pointerIndex = pointerIndex;
this->rightSide = rightSide;
this->lookahead = lookahead;
}
ParseRule::~ParseRule() {
}
const bool ParseRule::equalsExceptLookahead(const ParseRule &other) const {
return(leftHandle == other.leftHandle && rightSide == other.rightSide && pointerIndex == other.pointerIndex);
}
const bool ParseRule::operator==(const ParseRule &other) const {
return(equalsExceptLookahead(other) && (lookahead == other.lookahead));
}
const bool ParseRule::operator!=(const ParseRule &other) const {
return !(this->operator==(other));
}
const bool ParseRule::operator<(const ParseRule &other) const {
//Used for ordering so we can put ParseRule's in sets, and also so that ParseActions will have an ordering
if (leftHandle != other.leftHandle)
return leftHandle < other.leftHandle;
if (rightSide != other.rightSide)
return rightSide < other.rightSide;
if (lookahead != other.lookahead) {
return lookahead < other.lookahead;
}
return false;
}
ParseRule* ParseRule::clone() {
return( new ParseRule(leftHandle, pointerIndex, rightSide, lookahead) );
}
void ParseRule::setLeftHandle(Symbol leftHandle) {
this->leftHandle = leftHandle;
}
void ParseRule::appendToRight(Symbol appendee) {
rightSide.push_back(appendee);
}
Symbol ParseRule::getLeftSide() {
return leftHandle;
}
void ParseRule::setRightSide(std::vector<Symbol> rightSide) {
this->rightSide = rightSide;
}
std::vector<Symbol> ParseRule::getRightSide() {
return rightSide;
}
Symbol ParseRule::getAtNextIndex() {
if (pointerIndex >= rightSide.size())
return Symbol();
return rightSide[pointerIndex];
}
Symbol ParseRule::getAtIndex() {
if (pointerIndex < 1)
return Symbol();
return rightSide[pointerIndex-1];
}
int ParseRule::getRightSize() {
return rightSide.size();
}
int ParseRule::getIndex() {
return pointerIndex;
}
bool ParseRule::advancePointer() {
if (pointerIndex < rightSide.size()) {
pointerIndex++;
return true;
}
return false;
}
bool ParseRule::isAtEnd() {
return pointerIndex == rightSide.size();
}
void ParseRule::setLookahead(std::vector<Symbol> lookahead) {
this->lookahead = lookahead;
}
void ParseRule::addLookahead(std::vector<Symbol> lookahead) {
for (std::vector<Symbol>::size_type i = 0; i < lookahead.size(); i++) {
bool alreadyIn = false;
for (std::vector<Symbol>::size_type j = 0; j < this->lookahead.size(); j++) {
if (lookahead[i] == this->lookahead[j]) {
alreadyIn = true;
break;
}
}
if (!alreadyIn)
this->lookahead.push_back(lookahead[i]);
}
}
std::vector<Symbol> ParseRule::getLookahead() {
return lookahead;
}
std::string ParseRule::toString(bool printLookahead) {
std::string concat = leftHandle.toString() + " -> ";
for (int i = 0; i < rightSide.size(); i++) {
if (i == pointerIndex)
concat += "(*) ";
concat += rightSide[i].toString() + " ";
}
if (pointerIndex >= rightSide.size())
concat += "(*)";
if (printLookahead && lookahead.size()) {
concat += "**";
for (std::vector<Symbol>::size_type i = 0; i < lookahead.size(); i++)
concat += lookahead[i].toString();
concat += "**";
}
return(concat);
}
std::string ParseRule::toDOT() {
std::string concat = "";
for (int i = 0; i < rightSide.size(); i++) {
concat += leftHandle.toString() + " -> " + rightSide[i].toString() + ";\n";
}
return(concat);
}
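
A ParseRule doubles as an LR item: pointerIndex marks the parser's dot within the right-hand side, and toString prints the dot as (*). A brief sketch of one item being advanced; the grammar symbols are invented, and it assumes ParseRule.h defaults toString's argument as the no-argument calls in Parser.cpp suggest:

#include "ParseRule.h"
#include <iostream>

int main() {
    ParseRule rule;
    rule.setLeftHandle(Symbol("sum", false));
    rule.appendToRight(Symbol("sum", false));
    rule.appendToRight(Symbol("\\+", true));
    rule.appendToRight(Symbol("num", true));
    std::cout << rule.toString() << std::endl;   // roughly: sum -> (*) sum \+ num
    rule.advancePointer();
    rule.advancePointer();
    std::cout << rule.toString() << std::endl;   // roughly: sum -> sum \+ (*) num
    std::cout << (rule.isAtEnd() ? "complete item" : "incomplete item") << std::endl;
    return 0;
}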

View File

@@ -1,407 +0,0 @@
#include "Parser.h"
Parser::Parser() : EOFSymbol("$EOF$", true), nullSymbol("$NULL$", true), invalidSymbol("$INVALID$", true){
table.setSymbols(EOFSymbol, nullSymbol);
}
Parser::~Parser() {
}
void Parser::exportTable(std::ofstream &file) {
//Do table
table.exportTable(file);
}
void Parser::importTable(char* tableData) {
//Do table
table.importTable(tableData);
return;
}
Symbol Parser::getOrAddSymbol(std::string symbolString, bool isTerminal) {
Symbol symbol;
std::pair<std::string, bool> entry = std::make_pair(symbolString, isTerminal);
if (symbols.find(entry) == symbols.end()) {
symbol = Symbol(symbolString, isTerminal);
symbols[entry] = symbol;
} else {
symbol = symbols[entry];
}
return(symbol);
}
void Parser::loadGrammer(std::string grammerInputString) {
reader.setString(grammerInputString);
std::string currToken = reader.word(false); //Don't truncate so we can find the newline correctly (needed for comments)
while(currToken != "") {
//First, if this starts with a '#', skip this
if (currToken.front() == '#') {
//If this line is more than one token long, eat it
//std::cout << "Ate: " << currToken << std::endl;
if (currToken.back() != '\n') {
std::string ate = reader.line();
//std::cout << "Eating " << ate << " b/c grammer comment" << std::endl;
}
currToken = reader.word(false);
continue;
}
if (currToken.back() == '\n' || currToken.back() == ' ' || currToken.back() == '\t')
currToken.erase(currToken.size()-1);
//Load the left of the rule
ParseRule* currentRule = new ParseRule();
Symbol leftSide = getOrAddSymbol(currToken, false); //Left handle is never a terminal
currentRule->setLeftHandle(leftSide);
reader.word(); //Remove the =
//Add the right side, adding Symbols to symbol map.
currToken = reader.word();
while (currToken != ";") {
//If there are multiple endings to this rule, finish this rule and start a new one with same left handle
while (currToken == "|") {
//If we haven't added anything, that means that this is a null rule
if (currentRule->getRightSide().size() == 0)
currentRule->appendToRight(nullSymbol);
loadedGrammer.push_back(currentRule);
currentRule = new ParseRule();
currentRule->setLeftHandle(leftSide);
currToken = reader.word();
}
if (currToken == ";")
break;
if (currToken[0] == '\"') {
//Remove the quotes
currToken = currToken.substr(1,currToken.length()-2);
lexer.addRegEx(currToken);
currentRule->appendToRight(getOrAddSymbol(currToken, true)); //If first character is a ", then is a terminal
} else {
currentRule->appendToRight(getOrAddSymbol(currToken, false));
}
currToken = reader.word();
}
//Add new rule to grammer
//If we haven't added anything, that means that this is a null rule
if (currentRule->getRightSide().size() == 0)
currentRule->appendToRight(nullSymbol);
loadedGrammer.push_back(currentRule);
//Get next token
currToken = reader.word(false);
}
//std::cout << "Parsed!\n";
// for (std::vector<ParseRule*>::size_type i = 0; i < loadedGrammer.size(); i++)
// std::cout << loadedGrammer[i]->toString() << std::endl;
}
void Parser::createStateSet() {
std::cout << "Begining creation of stateSet" << std::endl;
//First state has no parents
//Set the first state's basis to be the goal rule with lookahead EOF
ParseRule* goalRule = loadedGrammer[0]->clone();
std::vector<Symbol> goalRuleLookahead;
goalRuleLookahead.push_back(EOFSymbol);
goalRule->setLookahead(goalRuleLookahead);
State* zeroState = new State(0, goalRule);
stateSets.push_back(zeroState);
std::queue<State*> toDo;
toDo.push(zeroState);
//std::cout << "Begining for main set for loop" << std::endl;
int count = 0;
while (toDo.size()) {
if (count % 200 == 0)
std::cout << "while count: " << count << std::endl;
count++;
//closure
closure(toDo.front());
//Add the new states
addStates(&stateSets, toDo.front(), &toDo);
toDo.pop();
}
table.remove(1, EOFSymbol);
}
int Parser::stateNum(State* state) {
for (std::vector<State*>::size_type i = 0; i < stateSets.size(); i++) {
if (*(stateSets[i]) == *state) {
return i;
}
}
return -1;
}
std::vector<Symbol> Parser::firstSet(Symbol token, std::vector<Symbol> avoidList, bool addNewTokens) {
if (tokenFirstSet.find(token) != tokenFirstSet.end())
return tokenFirstSet[token];
//If we've already done this token, don't do it again
for (std::vector<Symbol>::size_type i = 0; i < avoidList.size(); i++)
if (avoidList[i] == token)
return std::vector<Symbol>();
avoidList.push_back(token);
std::vector<Symbol> first;
//First, if the symbol is a terminal, than it's first set is just itself.
if (token.isTerminal()) {
first.push_back(token);
return(first);
}
//Otherwise....
//Ok, to make a first set, go through the grammer: if the token is a rule's left side, add the first set of that production's first token.
//If that first set includes null, do the next token too (if it exists).
Symbol rightToken;
std::vector<Symbol> recursiveFirstSet;
for (std::vector<ParseRule*>::size_type i = 0; i < loadedGrammer.size(); i++) {
if (token == loadedGrammer[i]->getLeftSide()) {
//Loop through the rule adding first sets for each token if the previous token contained NULL
int j = 0;
do {
rightToken = loadedGrammer[i]->getRightSide()[j]; //Get token of the right side of this rule
if (rightToken.isTerminal()) {
recursiveFirstSet.push_back(rightToken);
} else {
//Add the entire set
recursiveFirstSet = firstSet(rightToken, avoidList, false);//Don't add children to cache, as early termination may cause them to be incomplete
}
first.insert(first.end(), recursiveFirstSet.begin(), recursiveFirstSet.end());
j++;
} while (isNullable(rightToken) && loadedGrammer[i]->getRightSide().size() > j);
}
}
if (addNewTokens)
tokenFirstSet[token] = first;
return(first);
}
bool Parser::isNullable(Symbol token) {
if (tokenNullable.find(token) != tokenNullable.end())
return tokenNullable[token];
bool nullable = isNullableHelper(token, std::set<Symbol>());
tokenNullable[token] = nullable;
return nullable;
}
//We use this helper function to recurse because it is possible to wind up with loops, and if so we want
//early termination. However, this means that nullable determinations in the middle of the loop are inaccurate
//(since we terminated early), so we don't want to save them. Thus, for simplicity, only the main method will
//add to the cache. This is somewhat unfortunate for performance, but the necessary additions to keep track of
//invalidated state are more complicated than it's worth.
bool Parser::isNullableHelper(Symbol token, std::set<Symbol> done) {
if (token.isTerminal())
return token == nullSymbol;
if (done.find(token) != done.end())
return false;
done.insert(token);
if (tokenNullable.find(token) != tokenNullable.end())
return tokenNullable[token];
for (std::vector<ParseRule*>::size_type i = 0; i < loadedGrammer.size(); i++) {
if (token == loadedGrammer[i]->getLeftSide()) {
auto rightSide = loadedGrammer[i]->getRightSide();
bool ruleNullable = true;
for (int j = 0; j < rightSide.size(); j++) {
if (!isNullableHelper(rightSide[j], done)) {
ruleNullable = false;
break;
}
}
if (ruleNullable)
return true;
}
}
return false;
}
//Return the correct lookahead. This followSet is built based on the current rule's lookahead if at end, or the next Symbol's first set.
std::vector<Symbol> Parser::incrementiveFollowSet(ParseRule* rule) {
//Advance the pointer past the current Symbol (the one we want the followset for) to the next symbol (which might be in our follow set, or might be the end)
rule = rule->clone();
rule->advancePointer();
//Get the first set of the next Symbol. If it contains nullSymbol, keep doing for the next one
std::vector<Symbol> followSet;
std::vector<Symbol> symbolFirstSet;
bool symbolFirstSetHasNull = true;
while (symbolFirstSetHasNull && !rule->isAtEnd()) {
symbolFirstSetHasNull = false;
symbolFirstSet = firstSet(rule->getAtNextIndex());
for (std::vector<Symbol>::size_type i = 0; i < symbolFirstSet.size(); i++) {
if (symbolFirstSet[i] == nullSymbol) {
symbolFirstSetHasNull = true;
symbolFirstSet.erase(symbolFirstSet.begin()+i);
break;
}
}
followSet.insert(followSet.end(), symbolFirstSet.begin(), symbolFirstSet.end());
rule->advancePointer();
}
if (rule->isAtEnd()) {
symbolFirstSet = rule->getLookahead();
followSet.insert(followSet.end(), symbolFirstSet.begin(), symbolFirstSet.end());
}
std::vector<Symbol> followSetReturn;
for (std::vector<Symbol>::size_type i = 0; i < followSet.size(); i++) {
bool alreadyIn = false;
for (std::vector<Symbol>::size_type j = 0; j < followSetReturn.size(); j++)
if (followSet[i] == followSetReturn[j]) {
alreadyIn = true;
break;
}
if (!alreadyIn)
followSetReturn.push_back(followSet[i]);
}
delete rule;
return followSetReturn;
}
void Parser::closure(State* state) {
//Add all the applicable rules.
//std::cout << "Closure on " << state->toString() << " is" << std::endl;
std::vector<ParseRule*> stateTotal = state->getTotal();
for (std::vector<ParseRule*>::size_type i = 0; i < stateTotal.size(); i++) {
ParseRule* currentStateRule = stateTotal[i];
//If it's at its end, move on. We can't advance it.
if(currentStateRule->isAtEnd())
continue;
for (std::vector<ParseRule*>::size_type j = 0; j < loadedGrammer.size(); j++) {
//If the current symbol in the rule is not null (rule completed) and it equals a grammer's left side
ParseRule* currentGramRule = loadedGrammer[j]->clone();
if (currentStateRule->getAtNextIndex() == currentGramRule->getLeftSide()) {
//std::cout << (*stateTotal)[i]->getAtNextIndex()->toString() << " has an applicable production " << loadedGrammer[j]->toString() << std::endl;
//Now, add the correct lookahead. This followSet is built based on the current rule's lookahead if at end, or the next Symbol's first set.
//std::cout << "Setting lookahead for " << currentGramRule->toString() << " in state " << state->toString() << std::endl;
currentGramRule->setLookahead(incrementiveFollowSet(currentStateRule));
//Check to make sure not already in
bool isAlreadyInState = false;
for (std::vector<ParseRule*>::size_type k = 0; k < stateTotal.size(); k++) {
if (stateTotal[k]->equalsExceptLookahead(*currentGramRule)) {
//std::cout << (*stateTotal)[k]->toString() << std::endl;
stateTotal[k]->addLookahead(currentGramRule->getLookahead());
isAlreadyInState = true;
delete currentGramRule;
break;
}
}
if (!isAlreadyInState) {
state->remaining.push_back(currentGramRule);
stateTotal = state->getTotal();
}
} else {
delete currentGramRule;
}
}
}
//std::cout << state->toString() << std::endl;
}
//Adds state if it doesn't already exist.
void Parser::addStates(std::vector< State* >* stateSets, State* state, std::queue<State*>* toDo) {
std::vector< State* > newStates;
//For each rule in the state we already have
std::vector<ParseRule*> currStateTotal = state->getTotal();
for (std::vector<ParseRule*>::size_type i = 0; i < currStateTotal.size(); i++) {
//Clone the current rule
ParseRule* advancedRule = currStateTotal[i]->clone();
//Try to advance the pointer; if successful, see if it is the correct next symbol
if (advancedRule->advancePointer()) {
//Technically, the basis for the new state should be the set of rules that share this just-advanced-past symbol.
//So search our new states to see if any of them use this advanced symbol as a base.
//If so, add this rule to them.
//If not, create it.
bool symbolAlreadyInState = false;
for (std::vector< State* >::size_type j = 0; j < newStates.size(); j++) {
if (newStates[j]->basis[0]->getAtIndex() == advancedRule->getAtIndex()) {
symbolAlreadyInState = true;
//So now check to see if this exact rule is in this state
if (!newStates[j]->containsRule(advancedRule))
newStates[j]->basis.push_back(advancedRule);
//We found a state with the same symbol, so stop searching
break;
}
}
if (!symbolAlreadyInState) {
State* newState = new State(stateSets->size()+newStates.size(),advancedRule, state);
newStates.push_back(newState);
}
} else {
delete advancedRule;
}
//Also add any completed rules as reduces in the action table
//See if reduce
//Also, this really only needs to be done for the state's basis, but we're already iterating through, so...
std::vector<Symbol> lookahead = currStateTotal[i]->getLookahead();
if (currStateTotal[i]->isAtEnd()) {
for (std::vector<Symbol>::size_type j = 0; j < lookahead.size(); j++)
table.add(stateNum(state), lookahead[j], new ParseAction(ParseAction::REDUCE, currStateTotal[i]));
} else if (currStateTotal[i]->getAtNextIndex() == nullSymbol) {
//If this is a rule that produces only NULL, add the appropriate reduction, but use a new rule with a right side of length 0 (so we don't pop off the stack)
ParseRule* nullRule = currStateTotal[i]->clone();
nullRule->setRightSide(std::vector<Symbol>());
for (std::vector<Symbol>::size_type j = 0; j < lookahead.size(); j++)
table.add(stateNum(state), lookahead[j], new ParseAction(ParseAction::REDUCE, nullRule));
}
}
//Put all our new states in the set of states only if they're not already there.
bool stateAlreadyInAllStates = false;
Symbol currStateSymbol;
for (std::vector< State * >::size_type i = 0; i < newStates.size(); i++) {
stateAlreadyInAllStates = false;
currStateSymbol = (*(newStates[i]->getBasis()))[0]->getAtIndex();
for (std::vector< State * >::size_type j = 0; j < stateSets->size(); j++) {
if (newStates[i]->basisEquals(*((*stateSets)[j]))) {
stateAlreadyInAllStates = true;
//If it does exist, we should add it as the shift/goto in the action table
(*stateSets)[j]->addParents(newStates[i]->getParents());
table.add(stateNum(state), currStateSymbol, new ParseAction(ParseAction::SHIFT, j));
break;
}
}
if (!stateAlreadyInAllStates) {
//If the state does not already exist, add it and add it as the shift/goto in the action table
stateSets->push_back(newStates[i]);
toDo->push(newStates[i]);
table.add(stateNum(state), currStateSymbol, new ParseAction(ParseAction::SHIFT, stateSets->size()-1));
}
}
}
std::string Parser::stateSetToString() {
std::string concat = "";
for (std::vector< State *>::size_type i = 0; i < stateSets.size(); i++) {
concat += intToString(i) + " is " + stateSets[i]->toString();
}
return concat;
}
std::string Parser::tableToString() {
return table.toString();
}
//parseInput is now pure virtual
std::string Parser::grammerToString() {
//Iterate through the vector, adding string representation of each grammer rule
std::cout << "About to toString\n";
std::string concat = "";
for (int i = 0; i < loadedGrammer.size(); i++) {
concat += loadedGrammer[i]->toString() + "\n";
}
return(concat);
}
std::string Parser::grammerToDOT() {
//Iterate through the vector, adding DOT representation of each grammer rule
//std::cout << "About to DOT export\n";
std::string concat = "";
for (int i = 0; i < loadedGrammer.size(); i++) {
concat += loadedGrammer[i]->toDOT();
}
return("digraph Kraken_Grammer { \n" + concat + "}");
}
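
To make firstSet, isNullable, and incrementiveFollowSet concrete, here is a tiny worked example in the same grammar format that loadGrammer reads (terminals quoted, alternatives separated by |, an empty alternative becoming a $NULL$ rule). The grammar is invented for illustration:

list = item list | ;
item = "a" | "b" ;

isNullable(item) is false; isNullable(list) is true because its empty alternative reduces to $NULL$. firstSet(item) = { "a", "b" }, and firstSet(list) = { "a", "b", $NULL$ }: the first alternative contributes firstSet(item) and the null rule contributes the $NULL$ terminal. incrementiveFollowSet then builds lookaheads from these sets: for the item list = (*) item list with lookahead { $EOF$ }, expansions of item get the lookahead { "a", "b", $EOF$ }, i.e. firstSet(list) with $NULL$ removed plus the rule's own lookahead, since list is nullable and the dot can therefore reach the end of the rule.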

View File

@@ -1,565 +0,0 @@
#include "RNGLRParser.h"
#include <fstream>
//sorry about the macros
#define RESET "\033[0m"
#define BOLDRED "\033[1m\033[31m"
#define BOLDWHITE "\033[1m\033[37m"
#define BOLDGREEN "\033[1m\033[32m"
#define BOLDYELLOW "\033[1m\033[33m"
#define BOLDBLUE "\033[1m\033[34m"
#define BOLDMAGENTA "\033[1m\033[35m"
#define BOLDCYAN "\033[1m\033[36m"
RNGLRParser::RNGLRParser() {
//
}
RNGLRParser::~RNGLRParser() {
//
}
void RNGLRParser::printReconstructedFrontier(int frontier) {
std::vector<int> lastFrontier = gss.getFrontier(frontier);
for (int j = 0; j < lastFrontier.size(); j++) {
std::cout << "State: " << lastFrontier[j] << std::endl;
std::vector<std::pair<std::string, ParseAction>> stateParseActions = table.stateAsParseActionVector(lastFrontier[j]);
std::set<std::pair<std::string, ParseAction>> noRepeats;
for (auto k : stateParseActions)
noRepeats.insert(k);
for (auto k : noRepeats)
std::cout << k.first << " " << k.second.toString(false) << std::endl;
std::cout << std::endl;
}
}
NodeTree<Symbol>* RNGLRParser::parseInput(std::string inputString, std::string filename, bool highlight_errors) {
input.clear();
gss.clear();
while(!toReduce.empty()) toReduce.pop();
while(!toShift.empty()) toShift.pop();
SPPFStepNodes.clear();
nullableParts.clear();
packedMap.clear();
bool errord = false;
//Check for no tokens
bool accepting = false;
if (inputString == "") {
std::vector<ParseAction*>* zeroStateActions = table.get(0,EOFSymbol);
for (int i = 0; i < zeroStateActions->size(); i++) {
if ((*zeroStateActions)[i]->action == ParseAction::REDUCE)
accepting = true;
}
if (accepting) {
std::cout << "Accepted!" << std::endl;
return getNullableParts((*(stateSets[0]->getBasis()))[0]->getLeftSide());
} else {
std::cerr << "Rejected, no input (with no accepting state)" << std::endl;
}
return new NodeTree<Symbol>();
}
lexer.reset();
lexer.setInput(inputString);
//Now fully lex our input, because this algorithm was designed in that manner and it simplifies this first implementation.
//It could be converted to on-line later.
int tokenNum = 1;
Symbol currentToken = lexer.next();
input.push_back(currentToken);
while (currentToken != EOFSymbol) {
currentToken = lexer.next();
//std::cout << "CurrentToken is " << currentToken.toString() << std::endl;
if (currentToken == invalidSymbol) {
std::cerr << filename << ":" << findLine(tokenNum) << std::endl;
errord = true;
std::cerr << "lex error" << std::endl;
std::cerr << "Invalid Symbol!" << std::endl;
throw "Invalid Symbol, cannot lex";
}
input.push_back(currentToken);
tokenNum++;
}
// std::cout << "\nDone with Lexing, length:" << input.size() << std::endl;
// std::cout << input[0].toString() << std::endl;
// for (int i = 0; i < input.size(); i++)
// std::cout << "|" << input[i]->toString() << "|";
// std::cout << std::endl;
//std::cout << "Setting up 0th frontier, first actions, toShift, toReduce" << std::endl;
//Frontier 0, new node with state 0
NodeTree<int>* v0 = gss.newNode(0);
gss.addToFrontier(0,v0);
//std::cout << "Done setting up new frontier" << std::endl;
std::vector<ParseAction*> firstActions = *(table.get(0, input[0]));
for (std::vector<ParseAction*>::size_type i = 0; i < firstActions.size(); i++) {
if (firstActions[i]->action == ParseAction::SHIFT)
toShift.push(std::make_pair(v0,firstActions[i]->shiftState));
else if (firstActions[i]->action == ParseAction::REDUCE && fullyReducesToNull(firstActions[i]->reduceRule)) {
Reduction newReduction = {v0, firstActions[i]->reduceRule->getLeftSide(), 0, getNullableParts(firstActions[i]->reduceRule), NULL};
toReduce.push(newReduction);
}
}
// std::cout << "GSS:\n" << gss.toString() << std::endl;
//std::cout << "Starting parse loop" << std::endl;
for (int i = 0; i < input.size(); i++) {
// std::cout << "Checking if frontier " << i << " is empty" << std::endl;
if (gss.frontierIsEmpty(i)) {
//std::cout << "Frontier " << i << " is empty." << std::endl;
//std::cerr << "Parsing failed on " << input[i].toString() << std::endl;
//std::cerr << "Problem is on line: " << findLine(i) << std::endl;
// std::cerr << filename << ":" << findLine(i) << std::endl;
errord = true;
if (highlight_errors)
std::cout << BOLDBLUE;
std::cout << filename << ":" << findLine(i) << std::endl;
if (highlight_errors)
std::cout << BOLDMAGENTA;
std::cout << ": parse error" << std::endl;
std::ifstream infile(filename);
std::string line;
int linecount = 0;
while(std::getline(infile,line))
{
if(linecount == findLine(i) - 1) {
if (highlight_errors)
std::cout << BOLDRED;
std::cout << line << std::endl;
}
linecount++;
}
if (highlight_errors)
std::cout << RESET << std::endl;
break;
}
//Clear the vector of SPPF nodes created every step
SPPFStepNodes.clear();
while (toReduce.size() != 0) {
//std::cout << "Reducing for " << i << std::endl;
//std::cout << "GSS:\n" << gss.toString() << std::endl;
reducer(i);
}
// std::cout << "Shifting for " << i << std::endl;
shifter(i);
//std::cout << "GSS:\n" << gss.toString() << std::endl;
}
//std::cout << "Done with parsing loop, checking for acceptance" << std::endl;
NodeTree<int>* accState = gss.frontierGetAccState(input.size()-1);
if (accState) {
std::cout << "Accepted!" << std::endl;
return gss.getEdge(accState, v0);
}
if (!errord) {
std::cerr << filename << ":" << findLine(input.size())-2 << std::endl;
std::cerr << "parse error" << std::endl;
std::cerr << "Nearby is:" << std::endl;
}
std::cerr << "Rejected!" << std::endl;
// std::cout << "GSS:\n" << gss.toString() << std::endl;
return NULL;
}
void RNGLRParser::reducer(int i) {
Reduction reduction = toReduce.front();
toReduce.pop();
//std::cout << "Doing reduction of length " << reduction.length << " from state " << reduction.from->getData() << " to symbol " << reduction.symbol->toString() << std::endl;
int pathLength = reduction.length > 0 ? reduction.length -1 : 0;
//Get every reachable path
std::vector<std::vector<NodeTree<int>*> >* paths = gss.getReachablePaths(reduction.from, pathLength);
for (std::vector<std::vector<NodeTree<int>*> >::size_type j = 0; j < paths->size(); j++) {
std::vector<NodeTree<int>*> currentPath = (*paths)[j];
//Get the edges for the current path
std::vector<NodeTree<Symbol>*> pathEdges = getPathEdges(currentPath);
std::reverse(pathEdges.begin(), pathEdges.end());
//If the reduction length is 0, label as passed in is null
if (reduction.length != 0)
pathEdges.push_back(reduction.label);
//The end of the current path
NodeTree<int>* currentReached = currentPath[currentPath.size()-1];
//std::cout << "Getting the shift state for state " << currentReached->getData() << " and symbol " << reduction.symbol.toString() << std::endl;
int toState = table.getShift(currentReached->getData(), reduction.symbol)->shiftState;
//If reduction length is 0, then we make the new label the appropriate nullable parts
NodeTree<Symbol>* newLabel = NULL;
if (reduction.length == 0) {
newLabel = reduction.nullableParts;
} else {
//Otherwise, we create the new label if we haven't already
int reachedFrontier = gss.getContainingFrontier(currentReached);
for (std::vector<std::pair<NodeTree<Symbol>*, int> >::size_type k = 0; k < SPPFStepNodes.size(); k++) {
if ( SPPFStepNodes[k].second == reachedFrontier && SPPFStepNodes[k].first->getData() == reduction.symbol) {
newLabel = SPPFStepNodes[k].first;
break;
}
}
if (!newLabel) {
newLabel = new NodeTree<Symbol>("frontier: " + intToString(reachedFrontier), reduction.symbol);
SPPFStepNodes.push_back(std::make_pair(newLabel, reachedFrontier));
}
}
NodeTree<int>* toStateNode = gss.inFrontier(i, toState);
if (toStateNode) {
if (!gss.hasEdge(toStateNode, currentReached)) {
gss.addEdge(toStateNode, currentReached, newLabel);
if (reduction.length != 0) {
//Do all non null reduction
//std::cout << "Checking for non-null reductions in states that already existed" << std::endl;
std::vector<ParseAction*> actions = *(table.get(toState, input[i]));
for (std::vector<ParseAction*>::size_type k = 0; k < actions.size(); k++) {
if (actions[k]->action == ParseAction::REDUCE && !fullyReducesToNull(actions[k]->reduceRule)) {
Reduction newReduction = {currentReached, actions[k]->reduceRule->getLeftSide(), actions[k]->reduceRule->getIndex(), getNullableParts(actions[k]->reduceRule), newLabel};
toReduce.push(newReduction);
}
}
}
}
} else {
toStateNode = gss.newNode(toState);
gss.addToFrontier(i, toStateNode);
gss.addEdge(toStateNode, currentReached, newLabel);
//std::cout << "Adding shifts and reductions for a state that did not exist" << std::endl;
std::vector<ParseAction*> actions = *(table.get(toState, input[i]));
for (std::vector<ParseAction*>::size_type k = 0; k < actions.size(); k++) {
//std::cout << "Action is " << actions[k]->toString() << std::endl;
if (actions[k]->action == ParseAction::SHIFT) {
toShift.push(std::make_pair(toStateNode, actions[k]->shiftState));
} else if (actions[k]->action == ParseAction::REDUCE && fullyReducesToNull(actions[k]->reduceRule)) {
Reduction newReduction = {toStateNode, actions[k]->reduceRule->getLeftSide(), 0, getNullableParts(actions[k]->reduceRule), NULL};
toReduce.push(newReduction);
} else if (reduction.length != 0 && actions[k]->action == ParseAction::REDUCE && !fullyReducesToNull(actions[k]->reduceRule)) {
Reduction newReduction = {currentReached, actions[k]->reduceRule->getLeftSide(), actions[k]->reduceRule->getIndex(), getNullableParts(actions[k]->reduceRule), newLabel};
toReduce.push(newReduction);
}
}
}
if (reduction.length != 0)
addChildren(newLabel, &pathEdges, reduction.nullableParts);
}
}
void RNGLRParser::shifter(int i) {
if (i != input.size()-1) {
std::queue< std::pair<NodeTree<int>*, int> > nextShifts;
NodeTree<Symbol>* newLabel = new NodeTree<Symbol>("frontier: " + intToString(i), input[i]);
while (!toShift.empty()) {
std::pair<NodeTree<int>*, int> shift = toShift.front();
toShift.pop();
//std::cout << "Current potential shift from " << shift.first->getData() << " to " << shift.second << std::endl;
NodeTree<int>* shiftTo = gss.inFrontier(i+1, shift.second);
if (shiftTo) {
//std::cout << "State already existed, just adding edge" << std::endl;
gss.addEdge(shiftTo, shift.first, newLabel);
std::vector<ParseAction*> actions = *(table.get(shift.second, input[i+1]));
for (std::vector<ParseAction*>::size_type j = 0; j < actions.size(); j++) {
if (actions[j]->action == ParseAction::REDUCE && !fullyReducesToNull(actions[j]->reduceRule)) {
Reduction newReduction = {shift.first, actions[j]->reduceRule->getLeftSide(), actions[j]->reduceRule->getIndex(), getNullableParts(actions[j]->reduceRule), newLabel};
toReduce.push(newReduction);
}
}
} else {
//std::cout << "State did not already exist, adding" << std::endl;
shiftTo = gss.newNode(shift.second);
gss.addToFrontier(i+1, shiftTo);
gss.addEdge(shiftTo, shift.first, newLabel);
std::vector<ParseAction*> actions = *(table.get(shift.second, input[i+1]));
for (std::vector<ParseAction*>::size_type j = 0; j < actions.size(); j++) {
//std::cout << "Adding action " << actions[j]->toString() << " to either nextShifts or toReduce" << std::endl;
//Shift
if (actions[j]->action == ParseAction::SHIFT) {
nextShifts.push(std::make_pair(shiftTo, actions[j]->shiftState));
} else if (actions[j]->action == ParseAction::REDUCE && !fullyReducesToNull(actions[j]->reduceRule)) {
Reduction newReduction = {shift.first, actions[j]->reduceRule->getLeftSide(), actions[j]->reduceRule->getIndex(), getNullableParts(actions[j]->reduceRule), newLabel};
toReduce.push(newReduction);
} else if (actions[j]->action == ParseAction::REDUCE && fullyReducesToNull(actions[j]->reduceRule)) {
Reduction newReduction = {shiftTo, actions[j]->reduceRule->getLeftSide(), 0, getNullableParts(actions[j]->reduceRule), NULL};
toReduce.push(newReduction);
}
}
}
}
toShift = nextShifts;
}
}
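//Attaches the edges along a reduction path (plus any nullable parts) as children of an SPPF node.
//If the node already has a different family of children, both families are wrapped in packing
//nodes so the SPPF can represent the ambiguity.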
void RNGLRParser::addChildren(NodeTree<Symbol>* parent, std::vector<NodeTree<Symbol>*>* children, NodeTree<Symbol>* nullableParts) {
if (nullableParts)
children->push_back(nullableParts);
if (!belongsToFamily(parent, children)) {
if (parent->getChildren().size() == 0) {
parent->addChildren(children);
} else {
if (!arePacked(parent->getChildren())) {
NodeTree<Symbol>* subParent = new NodeTree<Symbol>("AmbiguityPackInner", Symbol("AmbiguityPackInner", true));
setPacked(subParent, true);
std::vector<NodeTree<Symbol>*> tmp = parent->getChildren();
subParent->addChildren(&tmp);
parent->clearChildren();
parent->addChild(subParent);
}
NodeTree<Symbol>* t = new NodeTree<Symbol>("AmbiguityPackOuter", Symbol("AmbiguityPackInner", true));
setPacked(t, true);
parent->addChild(t);
t->addChildren(children);
}
}
}
bool RNGLRParser::belongsToFamily(NodeTree<Symbol>* node, std::vector<NodeTree<Symbol>*>* nodes) {
//std::cout << "Checking " << node->getData()->toString() << "'s family" << std::endl;
std::vector<NodeTree<Symbol>*> children = node->getChildren();
for (std::vector<NodeTree<Symbol>*>::size_type i = 0; i < nodes->size(); i++) {
bool containsOne = false;
for (std::vector<NodeTree<Symbol>*>::size_type j = 0; j < children.size(); j++) {
//Not sure where null comes from. For right now, just check to be sure we don't segfault
if ((*nodes)[i] == children[j] || ( (*nodes)[i] != NULL && children[j] != NULL && (*(*nodes)[i]) == *(children[j]) )) {
containsOne = true;
break;
}
}
if (!containsOne) {
return false;
}
}
return true;
}
bool RNGLRParser::arePacked(std::vector<NodeTree<Symbol>*> nodes) {
bool packed = true;
for (std::vector<NodeTree<Symbol>*>::size_type i = 0; i < nodes.size(); i++)
packed &= packedMap[*(nodes[i])];
return packed;
}
bool RNGLRParser::isPacked(NodeTree<Symbol>* node) {
return packedMap[*node];
}
void RNGLRParser::setPacked(NodeTree<Symbol>* node, bool isPacked) {
packedMap[*node] = isPacked;
}
//Have to use own add states function in order to construct RN table instead of LALR table
void RNGLRParser::addStates(std::vector< State* >* stateSets, State* state, std::queue<State*>* toDo) {
std::vector< State* > newStates;
//For each rule in the state we already have
std::vector<ParseRule*> currStateTotal = state->getTotal();
for (std::vector<ParseRule*>::size_type i = 0; i < currStateTotal.size(); i++) {
//Clone the current rule
ParseRule* advancedRule = currStateTotal[i]->clone();
//Try to advance the pointer, if successful see if it is the correct next symbol
if (advancedRule->advancePointer()) {
//Technically, it should be the set of rules sharing this symbol advanced past in the basis for new state
//So search our new states to see if any of them use this advanced symbol as a base.
//If so, add this rule to them.
//If not, create it.
bool symbolAlreadyInState = false;
for (std::vector< State* >::size_type j = 0; j < newStates.size(); j++) {
if (newStates[j]->basis[0]->getAtIndex() == advancedRule->getAtIndex()) {
symbolAlreadyInState = true;
//Add rule to state, combining with an identical rule (except for lookahead) if one exists
newStates[j]->addRuleCombineLookahead(advancedRule);
//We found a state with the same symbol, so stop searching
break;
}
}
if (!symbolAlreadyInState) {
State* newState = new State(stateSets->size()+newStates.size(),advancedRule, state);
newStates.push_back(newState);
}
} else {
delete advancedRule;
}
}
//Put all our new states in the set of states only if they're not already there.
bool stateAlreadyInAllStates = false;
Symbol currStateSymbol;
for (std::vector< State * >::size_type i = 0; i < newStates.size(); i++) {
stateAlreadyInAllStates = false;
currStateSymbol = (*(newStates[i]->getBasis()))[0]->getAtIndex();
for (std::vector< State * >::size_type j = 0; j < stateSets->size(); j++) {
if (newStates[i]->basisEqualsExceptLookahead(*((*stateSets)[j]))) {
//if (newStates[i]->basisEquals(*((*stateSets)[j]))) {
stateAlreadyInAllStates = true;
//If it does exist, we should add it as the shift/goto in the action table
//std::cout << "newStates[" << i << "] == stateSets[" << j << "]" << std::endl;
if (!((*stateSets)[j]->basisEquals(*(newStates[i]))))
toDo->push((*stateSets)[j]);
(*stateSets)[j]->combineStates(*(newStates[i]));
//std::cout << j << "\t Hay, doing an inside loop state reductions!" << std::endl;
addStateReductionsToTable((*stateSets)[j]);
table.add(stateNum(state), currStateSymbol, new ParseAction(ParseAction::SHIFT, j));
break;
}
}
if (!stateAlreadyInAllStates) {
//If the state does not already exist, add it and add it as the shift/goto in the action table
stateSets->push_back(newStates[i]);
toDo->push(newStates[i]);
table.add(stateNum(state), currStateSymbol, new ParseAction(ParseAction::SHIFT, stateSets->size()-1));
}
}
addStateReductionsToTable(state);
}
void RNGLRParser::addStateReductionsToTable(State* state) {
std::vector<ParseRule*> currStateTotal = state->getTotal();
//std::cout << currStateTotal->size() << "::" << state->getNumber() << std::endl;
for (std::vector<ParseRule*>::size_type i = 0; i < currStateTotal.size(); i++) {
//See if reduce
//Also, this really only needs to be done for the state's basis, but we're already iterating through, so...
std::vector<Symbol> lookahead = currStateTotal[i]->getLookahead();
if (currStateTotal[i]->isAtEnd()) {
for (std::vector<Symbol>::size_type j = 0; j < lookahead.size(); j++) {
table.add(stateNum(state), lookahead[j], new ParseAction(ParseAction::REDUCE, currStateTotal[i]));
}
//If this has an appropriate reduction to null, get the reduce trees out
} else if (reducesToNull(currStateTotal[i])) {
//std::cout << (*currStateTotal)[i]->toString() << " REDUCES TO NULL" << std::endl;
//It used to be that, if this rule produces only NULL, we added the appropriate reduction but used a new rule whose right side equaled
//the part of the rule we had already gone through (so we don't pop extra off the stack).
//Now we use the same rule and make sure that the index location is used
for (std::vector<Symbol>::size_type j = 0; j < lookahead.size(); j++)
table.add(stateNum(state), lookahead[j], new ParseAction(ParseAction::REDUCE, currStateTotal[i]));
}
}
}
bool RNGLRParser::fullyReducesToNull(ParseRule* rule) {
return rule->getIndex() == 0 && reducesToNull(rule);
}
bool RNGLRParser::reducesToNull(ParseRule* rule) {
auto itr = reduceToNullMap.find(rule);
if (itr != reduceToNullMap.end())
return itr->second;
std::vector<Symbol> avoidList;
auto val = reducesToNull(rule, avoidList);
reduceToNullMap[rule] = val;
return val;
}
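//Recursive helper: decides whether the not-yet-consumed part of the rule can derive the empty string.
//The avoidList tracks nonterminals already being expanded so mutually recursive rules do not loop forever.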
bool RNGLRParser::reducesToNull(ParseRule* rule, std::vector<Symbol> avoidList) {
//If the rule is completed and not null, it doesn't reduce to null, it's just completed.
if (rule->isAtEnd() && rule->getRightSize() != 0)
return false;
for (std::vector<Symbol>::size_type i = 0; i < avoidList.size(); i++)
if (rule->getLeftSide() == avoidList[i])
return false;
avoidList.push_back(rule->getLeftSide());
std::vector<Symbol> rightSide = rule->getRightSide();
bool reduces = true;
for (std::vector<Symbol>::size_type i = rule->getIndex(); i < rightSide.size(); i++) {
if (rightSide[i] == nullSymbol)
continue;
if (rightSide[i].isTerminal()) {
reduces = false;
break;
}
bool subSymbolReduces = false;
for (std::vector<ParseRule*>::size_type j = 0; j < loadedGrammer.size(); j++) {
if (loadedGrammer[j]->getLeftSide() == rightSide[i]) {
if(reducesToNull(loadedGrammer[j], avoidList)) {
subSymbolReduces = true;
break;
}
}
}
if (!subSymbolReduces) {
reduces = false;
break;
}
}
return reduces;
}
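//Builds the SPPF subtree for a nullable rule: a node for the rule's left-hand side whose children are
//the (recursively constructed) nullable derivations of its right-hand side. Used to label zero-length reductions.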
NodeTree<Symbol>* RNGLRParser::getNullableParts(ParseRule* rule) {
return getNullableParts(rule, std::vector<NodeTree<Symbol>*>());
}
NodeTree<Symbol>* RNGLRParser::getNullableParts(ParseRule* rule, std::vector<NodeTree<Symbol>*> avoidList) {
if (reducesToNull(rule)) {
//std::cout << "Reduces to null so adding parts " << rule->toString() << std::endl;
Symbol symbol = rule->getLeftSide();
NodeTree<Symbol>* symbolNode = new NodeTree<Symbol>(symbol.getName(), symbol);
if (rule->getAtNextIndex() == nullSymbol) {
symbolNode->addChild(new NodeTree<Symbol>(nullSymbol.getName(), nullSymbol));
} else {
//Find recursively
ParseRule* iterate = rule->clone();
while (!iterate->isAtEnd()) {
//Check to see if we've done this symbol already, if so use it
for (std::vector<NodeTree<Symbol>*>::size_type i = 0; i < avoidList.size(); i++) {
if (iterate->getAtNextIndex() == avoidList[i]->getData()) {
symbolNode->addChild(avoidList[i]);
break;
}
}
//We haven't so do it recursively
for (std::vector<ParseRule*>::size_type i = 0; i < loadedGrammer.size(); i++) {
if (fullyReducesToNull(loadedGrammer[i]) && iterate->getAtNextIndex() == loadedGrammer[i]->getLeftSide()) {
NodeTree<Symbol>* symbolTree = getNullableParts(loadedGrammer[i], avoidList);
avoidList.push_back(symbolTree);
symbolNode->addChild(symbolTree);
}
}
iterate->advancePointer();
}
}
return symbolNode;
}
return NULL;
}
NodeTree<Symbol>* RNGLRParser::getNullableParts(Symbol symbol) {
return new NodeTree<Symbol>("CRAZY_SYMBOL", nullSymbol);
}
std::vector<NodeTree<Symbol>*> RNGLRParser::getPathEdges(std::vector<NodeTree<int>*> path) {
std::vector<NodeTree<Symbol>*> pathEdges;
for (std::vector<NodeTree<int>*>::size_type i = 0; i < path.size()-1; i++)
pathEdges.push_back(gss.getEdge(path[i], path[i+1]));
return pathEdges;
}
int RNGLRParser::findLine(int tokenNum) {
int lineNo = 1;
for (int i = 0; i < tokenNum; i++) {
std::string tokenString = input[i].getValue();
for (int j = 0; j < tokenString.size(); j++)
if (tokenString[j] == '\n')
lineNo++;
}
return lineNo;
}

View File

@@ -1,225 +0,0 @@
#include "RegEx.h"
#include <cassert>
RegEx::RegEx(std::string inPattern) {
pattern = inPattern;
std::vector<RegExState*> ending;
begin = construct(&ending, inPattern);
//last one is goal state, add it to the end of all of these last states
for (std::vector<RegExState*>::size_type i = 0; i < ending.size(); i++)
ending[i]->addNext(NULL);
}
RegExState* RegEx::construct(std::vector<RegExState*>* ending, std::string pattern) {
	//In the RegEx re-write, instead of doing complicated un-parenthesizing, we keep track of both the "front" and the "end" of a state.
	//(these could be different if the state is parenthesized)
std::vector<RegExState*> previousStatesBegin;
std::vector<RegExState*> previousStatesEnd;
std::vector<RegExState*> currentStatesBegin;
std::vector<RegExState*> currentStatesEnd;
bool alternating = false;
RegExState* begin = new RegExState();
currentStatesBegin.push_back(begin);
currentStatesEnd.push_back(begin);
for (int i = 0; i < pattern.length(); i++) {
switch (pattern[i]) {
case '*':
{
//std::cout << "Star at " << i << " in " << pattern << std::endl;
//NOTE: Because of the re-write, this is necessary again
for (std::vector<RegExState*>::size_type j = 0; j < currentStatesEnd.size(); j++)
for (std::vector<RegExState*>::size_type k = 0; k < currentStatesBegin.size(); k++)
currentStatesEnd[j]->addNext(currentStatesBegin[k]); //Make the ends point to the beginnings
//add all previous states to current states to enable skipping over the starred item
currentStatesBegin.insert(currentStatesBegin.end(), previousStatesBegin.begin(), previousStatesBegin.end());
currentStatesEnd.insert(currentStatesEnd.end(), previousStatesEnd.begin(), previousStatesEnd.end());
}
break;
case '+':
{
//std::cout << "Plus at " << i << " in " << pattern << std::endl;
//NOTE: Because of the re-write, this is necessary again
for (std::vector<RegExState*>::size_type j = 0; j < currentStatesEnd.size(); j++)
for (std::vector<RegExState*>::size_type k = 0; k < currentStatesBegin.size(); k++)
currentStatesEnd[j]->addNext(currentStatesBegin[k]); //Make the ends point to the beginnings
}
break;
case '?':
{
//std::cout << "Question at " << i << " in " << pattern << std::endl;
//add all previous states to current states to enable skipping over the questioned item
currentStatesBegin.insert(currentStatesBegin.end(), previousStatesBegin.begin(), previousStatesBegin.end());
currentStatesEnd.insert(currentStatesEnd.end(), previousStatesEnd.begin(), previousStatesEnd.end());
}
break;
case '|':
{
//std::cout << "Alternation at " << i << " in " << pattern << std::endl;
//alternation
alternating = true;
}
break;
case '(':
{
//std::cout << "Begin peren at " << i << " in " << pattern << std::endl;
//parentheses
std::vector<RegExState*> innerEnds;
int perenEnd = findPerenEnd(pattern, i);
RegExState* innerBegin = construct(&innerEnds, strSlice(pattern, i+1, perenEnd));
i = perenEnd;
std::vector<RegExState*> innerBegins = innerBegin->getNextStates();
if (alternating) {
for (std::vector<RegExState*>::size_type j = 0; j < previousStatesEnd.size(); j++)
for (std::vector<RegExState*>::size_type k = 0; k < innerBegins.size(); k++)
previousStatesEnd[j]->addNext(innerBegins[k]);
currentStatesBegin.insert(currentStatesBegin.end(), innerBegins.begin(), innerBegins.end());
currentStatesEnd.insert(currentStatesEnd.end(), innerEnds.begin(), innerEnds.end());
} else {
for (std::vector<RegExState*>::size_type j = 0; j < currentStatesEnd.size(); j++)
for (std::vector<RegExState*>::size_type k = 0; k < innerBegins.size(); k++)
currentStatesEnd[j]->addNext(innerBegins[k]);
previousStatesBegin = currentStatesBegin;
previousStatesEnd = currentStatesEnd;
currentStatesBegin = innerBegins;
currentStatesEnd = innerEnds;
}
alternating = false;
}
break;
// ) does not need a case as we skip over it after finding it in ('s case
case '\\':
{
i++;
//std::cout << "Escape! Escaping: " << pattern[i] << std::endl;
//Ahh, it's escaping a special character, so fall through to the default.
}
default:
{
//std::cout << "Regular" << std::endl;
//Ahh, it's regular
RegExState* next = new RegExState(pattern[i]);
//If we're alternating, add next as the next for each previous state, and add self to currentStates
if (alternating) {
for (std::vector<RegExState*>::size_type j = 0; j < previousStatesEnd.size(); j++)
previousStatesEnd[j]->addNext(next);
currentStatesBegin.push_back(next);
currentStatesEnd.push_back(next);
alternating = false;
} else {
//If we're not alternating, add next as next for all the current states, make the current states the new
//previous states, and add ourself as the new current state.
for (std::vector<RegExState*>::size_type j = 0; j < currentStatesEnd.size(); j++)
currentStatesEnd[j]->addNext(next);
previousStatesBegin.clear();
previousStatesEnd.clear();
previousStatesBegin = currentStatesBegin;
previousStatesEnd = currentStatesEnd;
currentStatesBegin.clear();
currentStatesEnd.clear();
currentStatesBegin.push_back(next);
currentStatesEnd.push_back(next);
}
}
}
}
(*ending) = currentStatesEnd;
return(begin);
}
RegEx::~RegEx() {
//No cleanup necessary
}
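//Runs the NFA over stringToMatch and returns the length of the longest prefix that reaches a goal
//state, or -1 if not even the empty prefix matches.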
int RegEx::longMatch(std::string stringToMatch) {
// Start in the begin state (only).
int lastMatch = -1;
currentStates.clear();
currentStates.push_back(begin);
std::vector<RegExState*> nextStates;
for (int i = 0; i < stringToMatch.size(); i++) {
//Go through every current state. Check to see if it is goal, if so update last goal.
//Also, add each state's advance to nextStates
for (std::vector<RegExState*>::size_type j = 0; j < currentStates.size(); j++) {
if (currentStates[j]->isGoal())
lastMatch = i;
std::vector<RegExState*> addStates = currentStates[j]->advance(stringToMatch.at(i));
nextStates.insert(nextStates.end(), addStates.begin(), addStates.end());
}
//Now, clear our current states and add each of our nextStates if it is not already in currentStates
currentStates.clear();
for (std::vector<RegExState*>::size_type j = 0; j < nextStates.size(); j++) {
bool inCurrStates = false;
for (std::vector<RegExState*>::size_type k = 0; k < currentStates.size(); k++) {
if (nextStates[j] == currentStates[k])
inCurrStates = true;
}
if (!inCurrStates)
currentStates.push_back(nextStates[j]);
}
// if (currentStates.size() != 0)
// std::cout << "Matched " << i << " character: " << stringToMatch[i-1] << std::endl;
nextStates.clear();
//If we can't continue matching, just return our last matched
if (currentStates.size() == 0)
break;
}
//Check to see if we match on the last character in the string
for (std::vector<RegExState*>::size_type j = 0; j < currentStates.size(); j++) {
if (currentStates[j]->isGoal())
lastMatch = stringToMatch.size();
}
return lastMatch;
}
std::string RegEx::getPattern() {
return pattern;
}
std::string RegEx::toString() {
return pattern + " -> " + begin->toString();
}
void RegEx::test() {
{
RegEx re("a*");
assert(re.longMatch("a") == 1);
assert(re.longMatch("aa") == 2);
assert(re.longMatch("aaaab") == 4);
assert(re.longMatch("b") == 0);
}
{
RegEx re("a+");
assert(re.longMatch("aa") == 2);
assert(re.longMatch("aaaab") == 4);
assert(re.longMatch("b") == -1);
}
{
RegEx re("a(bc)?");
assert(re.longMatch("ab") == 1);
}
{
RegEx re("((ab)|c)*");
assert(re.longMatch("ababc") == 5);
assert(re.longMatch("ad") == 0);
assert(re.longMatch("ababccd") == 6);
}
{
RegEx re("bbb((bba+)|(ba+))*a*((a+b)|(a+bb)|(a+))*bbb") ;
assert(re.longMatch("bbbababbbaaaaaaaaaaaaaaaaaaabbb") == 9);
}
std::cout << "RegEx tests pass\n";
}

View File

@@ -1,82 +0,0 @@
#include "RegExState.h"
RegExState::RegExState(char inCharacter) {
character = inCharacter;
}
RegExState::RegExState() {
character = 0;
}
RegExState::~RegExState() {
//No cleanup necessary
}
void RegExState::addNext(RegExState* nextState) {
nextStates.push_back(nextState);
}
bool RegExState::characterIs(char inCharacter) {
return character == inCharacter;
}
std::vector<RegExState*> RegExState::advance(char advanceCharacter) {
std::vector<RegExState*> advanceStates;
for (std::vector<RegExState*>::size_type i = 0; i < nextStates.size(); i++) {
if (nextStates[i] != NULL && nextStates[i]->characterIs(advanceCharacter))
advanceStates.push_back(nextStates[i]);
}
return advanceStates;
}
std::vector<RegExState*> RegExState::getNextStates() {
return nextStates;
}
bool RegExState::isGoal() {
for (std::vector<RegExState*>::size_type i = 0; i < nextStates.size(); i++)
if (nextStates[i] == NULL)
return true;
return false;
}
std::string RegExState::toString() {
std::vector<RegExState*> avoidList;
return toString(&avoidList);
}
std::string RegExState::toString(RegExState* avoid) {
std::vector<RegExState*> avoidList;
avoidList.push_back(avoid);
return toString(&avoidList);
}
std::string RegExState::toString(std::vector<RegExState*>* avoid) {
avoid->push_back(this);
std::string string = "";
string += std::string("\"") + character + "\"";
for (std::vector<RegExState*>::size_type i = 0; i < nextStates.size(); i++) {
bool inAvoid = false;
for (std::vector<RegExState*>::size_type j = 0; j < avoid->size(); j++) {
if (nextStates[i] == (*avoid)[j]) {
inAvoid = true;
}
}
if (inAvoid) {
string += "->loop";
continue;
}
if (nextStates[i] != this && nextStates[i] != NULL)
string += "->" + nextStates[i]->toString(avoid) + " EC ";
else if (nextStates[i] == NULL)
string += "-> GOAL ";
else
string += "->this";
}
return string;
}
char RegExState::getCharacter() {
return character;
}

View File

@@ -1,164 +0,0 @@
#include "State.h"
State::State(int number, ParseRule* basis) {
this->number = number;
this->basis.push_back(basis);
}
State::State(int number, ParseRule* basis, State* parent) {
this->number = number;
this->basis.push_back(basis);
parents.push_back(parent);
}
State::~State() {
}
const bool State::operator==(const State &other) {
//return (basis == other.basis && remaining == other.remaining);
if (basis.size() != other.basis.size())
return false;
for (std::vector< ParseRule* >::size_type i = 0; i < basis.size(); i++) {
if (*(basis[i]) != *(other.basis[i]))
return false;
}
if (remaining.size() != other.remaining.size())
return false;
for (std::vector< ParseRule* >::size_type i = 0; i < remaining.size(); i++) {
if ( *(remaining[i]) != *(other.remaining[i]) )
return false;
}
return true;
}
const bool State::operator!=(const State &other) {
return !(this->operator==(other));
}
const bool State::basisEquals(const State &other) {
//return (basis == other.basis && remaining == other.remaining);
if (basis.size() != other.basis.size())
return false;
for (std::vector< ParseRule* >::size_type i = 0; i < basis.size(); i++) {
if (*(basis[i]) != (*(other.basis[i])))
return false;
}
return true;
}
const bool State::basisEqualsExceptLookahead(const State &other) {
//return (basis == other.basis && remaining == other.remaining);
if (basis.size() != other.basis.size())
return false;
for (std::vector< ParseRule* >::size_type i = 0; i < basis.size(); i++) {
if (!basis[i]->equalsExceptLookahead(*(other.basis[i])))
return false;
}
return true;
}
void State::combineStates(State &other) {
for (std::vector< ParseRule* >::size_type i = 0; i < other.basis.size(); i++) {
bool alreadyIn = false;
for (std::vector< ParseRule* >::size_type j = 0; j < basis.size(); j++) {
if (basis[j]->equalsExceptLookahead(*(other.basis[i]))) {
basis[j]->addLookahead(other.basis[i]->getLookahead());
alreadyIn = true;
}
}
if (!alreadyIn)
basis.push_back(other.basis[i]);
}
addParents(other.getParents());
}
std::vector<ParseRule*> State::getTotal() {
std::vector<ParseRule*> total;
total.insert(total.begin(), basis.begin(), basis.end());
total.insert(total.end(), remaining.begin(), remaining.end());
return total;
}
std::vector<ParseRule*>* State::getBasis() {
return &basis;
}
std::vector<ParseRule*>* State::getRemaining() {
return &remaining;
}
bool State::containsRule(ParseRule* rule) {
auto total = getTotal();
for (std::vector<ParseRule*>::size_type i = 0; i < total.size(); i++) {
if (*rule == *(total[i])) {
return true;
}
}
return false;
}
void State::addRuleCombineLookahead(ParseRule* rule) {
auto total = getTotal();
bool alreadyIn = false;
for (std::vector<ParseRule*>::size_type i = 0; i < total.size(); i++) {
if (rule->equalsExceptLookahead(*(total[i]))) {
total[i]->addLookahead(rule->getLookahead());
alreadyIn = true;
break;
}
}
if (!alreadyIn)
basis.push_back(rule);
}
std::string State::toString() {
std::string concat = "";
concat += "State " + intToString(number) + " with " + intToString(parents.size()) + " parents:\n";
for (std::vector<ParseRule*>::size_type j = 0; j < basis.size(); j++) {
concat += "\t" + basis[j]->toString() + "\n";
}
for (std::vector<ParseRule*>::size_type j = 0; j < remaining.size(); j++) {
concat += "\t+\t" + remaining[j]->toString() + "\n";
}
return concat;
}
void State::addParents(std::vector<State*>* parents) {
bool alreadyIn = false;
for (std::vector<State*>::size_type i = 0; i < parents->size(); i++) {
alreadyIn = false;
for (std::vector<State*>::size_type j = 0; j < this->parents.size(); j++) {
if (this->parents[j]->basisEquals(*((*parents)[i]))) {
alreadyIn = true;
break;
}
}
if (!alreadyIn)
this->parents.push_back((*parents)[i]);
}
}
std::vector<State*>* State::getParents() {
return &parents;
}
std::vector<State*>* State::getDeepParents(int depth) {
if (depth <= 0) {
std::vector<State*>* returnSelf = new std::vector<State*>();
returnSelf->push_back(this);
return returnSelf;
}
std::vector<State*>* recursiveParents = new std::vector<State*>();
std::vector<State*>* recursiveParentsToAdd;
for (std::vector<State*>::size_type i = 0; i < parents.size(); i++) {
recursiveParentsToAdd = parents[i]->getDeepParents(depth-1);
recursiveParents->insert(recursiveParents->end(), recursiveParentsToAdd->begin(), recursiveParentsToAdd->end());
}
return recursiveParents;
}
int State::getNumber() {
return number;
}

View File

@@ -1,166 +0,0 @@
#include "StringReader.h"
#include <cassert>
StringReader::StringReader()
{
str_pos = 0;
}
StringReader::StringReader(std::string inputString)
{
str_pos = 0;
setString(inputString);
}
StringReader::~StringReader()
{
//dtor
}
void StringReader::setString(std::string inputString)
{
rd_string = inputString;
end_reached = false;
}
std::string StringReader::word(bool truncateEnd)
{
std::string result = getTokens(" \n\t", truncateEnd);
while (result == " " || result == "\n" || result == "\t")
{
result = getTokens(" \n\t", truncateEnd);
}
return(result);
}
std::string StringReader::line(bool truncateEnd)
{
return getTokens("\n", truncateEnd);
}
std::string StringReader::getTokens(const char *stop_chars, bool truncateEnd)
{
if (str_pos >= rd_string.size())
return "";
size_t found_pos = rd_string.find_first_of(stop_chars, str_pos);
if (rd_string[str_pos] == '\"') {
//Find the next quote
found_pos = rd_string.find("\"", str_pos+1);
//Check to see if the quote is escaped
int numBackslashes = 0;
int countBack = 1;
while (found_pos >= countBack && rd_string[found_pos-countBack] == '\\') {
numBackslashes++;
countBack++;
}
//While the quote is escaped
while (numBackslashes % 2 == 1) {
//find the next quote
found_pos = rd_string.find("\"", found_pos+1);
//Check to see if it's escaped
numBackslashes = 0;
countBack = 1;
while (found_pos >= countBack && rd_string[found_pos-countBack] == '\\') {
numBackslashes++;
countBack++;
}
}
}
if (found_pos == str_pos) //We are at the endline
{
std::string stop_char(1, rd_string[str_pos]);
str_pos++;
return stop_char;
} else if (found_pos == std::string::npos) //We are at the end of the file
{
//End of String
end_reached = true;
//std::cout << "Reached end of file!\n";
return "";
} else {
if (truncateEnd) //If we want to get rid of the delimiting character, which is the default, don't add the last char. Note we have to increase str_pos by one manually later
found_pos -= 1;
if (rd_string[str_pos] == '\"')
found_pos++;
std::string string_section = rd_string.substr(str_pos, found_pos - str_pos + 1);
str_pos = found_pos + 1;
if (truncateEnd) //Ok, we didn't add the last char, but str_pos now points at that char. So we move it one ahead.
str_pos++;
return string_section;
}
}
void StringReader::test()
{
{
StringReader reader("\"x\"");
assert(reader.word() == "\"x\"");
assert(reader.word() == "");
}
{
StringReader reader("\"y\" ;\n");
assert(reader.word() == "\"y\"");
assert(reader.word() == ";");
assert(reader.word() == "");
}
{
StringReader reader("Goal = greeting ;\n"
"greeting = \"hello\" | greeting \"world\" ;\n");
assert(reader.word() == "Goal");
assert(reader.word() == "=");
assert(reader.word() == "greeting");
assert(reader.word() == ";");
assert(reader.word() == "greeting");
assert(reader.word() == "=");
assert(reader.word() == "\"hello\"");
assert(reader.word() == "|");
assert(reader.word() == "greeting");
assert(reader.word() == "\"world\"");
assert(reader.word() == ";");
assert(reader.word() == "");
}
{
StringReader reader("one # pretend this is a comment\n"
" two\n");
assert(reader.word() == "one");
assert(reader.word() == "#");
assert(reader.line() == "pretend this is a comment");
assert(reader.word() == "two");
assert(reader.word() == "");
}
{
// Quoted strings can span lines.
StringReader reader("x = \"\n \" ;\n");
assert(reader.word() == "x");
assert(reader.word() == "=");
assert(reader.word() == "\"\n \"");
assert(reader.word() == ";");
assert(reader.word() == "");
}
{
// Strings may contain backslash-escaped quote characters.
StringReader reader( "\"abc\\\"def\\\\\\\\\\\" \"\n");
assert(reader.word() == "\"abc\\\"def\\\\\\\\\\\" \"");
assert(reader.word() == "");
}
{
// A backslash-escaped backslash can be the last character in a string.
StringReader reader( "\"\\\\\" \n");
assert(reader.word() == "\"\\\\\"");
assert(reader.word() == "");
}
std::cout << "StringReader tests pass\n";
}

View File

@@ -1,52 +0,0 @@
#include "Symbol.h"
Symbol::Symbol() {
this->name = "UninitlizedSymbol";
this->terminal = false;
value = "NoValue";
}
Symbol::Symbol(std::string name, bool isTerminal) {
this->name = name;
this->terminal = isTerminal;
value = "NoValue";
}
Symbol::Symbol(std::string name, bool isTerminal, std::string value) {
this->name = name;
this->terminal = isTerminal;
this->value = value;
}
Symbol::~Symbol() {
}
const bool Symbol::operator==(const Symbol &other) const {
return( name == other.name && terminal == other.terminal);
}
const bool Symbol::operator!=(const Symbol &other) const {
return(!this->operator==(other));
}
const bool Symbol::operator<(const Symbol &other) const {
return name < other.getName();
}
std::string Symbol::getName() const {
return(name);
}
std::string Symbol::getValue() const {
return(value);
}
std::string Symbol::toString() const {
return(name + (terminal ? " " + value : ""));
}
bool Symbol::isTerminal() {
return terminal;
}

View File

@@ -1,388 +0,0 @@
#include "Table.h"
Table::Table() {
//
}
Table::~Table() {
//
}
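//Serializes the symbol index followed by the nested state/symbol/action vectors (including any
//reduce rules) in the binary layout that importTable reads back.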
void Table::exportTable(std::ofstream &file) {
//Save symbolIndexVec
int size = symbolIndexVec.size();
file.write((char*)&size, sizeof(int));
for (int i = 0; i < symbolIndexVec.size(); i++) {
//Save the name
std::string symbolName = symbolIndexVec[i].getName(); //Get the string
size = symbolName.size()+1;
file.write((char*)&size, sizeof(int)); //Save size of string
file.write((char*)(symbolName.c_str()), size); //Save the string
//Save the value
std::string symbolValue = symbolIndexVec[i].getValue(); //Get the string
size = symbolValue.size()+1;
file.write((char*)&size, sizeof(int)); //Save size of string
file.write((char*)(symbolValue.c_str()), size); //Save the string
bool isTerminal = symbolIndexVec[i].isTerminal();
file.write((char*)&isTerminal, sizeof(bool)); //Save the true false
}
//Save the actual table
size = table.size();
file.write((char*)&size, sizeof(int));
for (int i = 0; i < table.size(); i++) {
//each item is a middle vector
//std::vector< std::vector< std::vector<ParseAction*>* >* > table;
std::vector< std::vector<ParseAction*>* >* middleVector = table[i];
int middleVectorSize = middleVector->size();
file.write((char*)&middleVectorSize, sizeof(int));
for (int j = 0; j < middleVectorSize; j++) {
//each item is an inner vector
std::vector<ParseAction*>* innerVector = (*middleVector)[j];
int innerVectorSize = 0;
if (innerVector)
innerVectorSize = innerVector->size();
else
innerVectorSize = 0;
file.write((char*)&innerVectorSize, sizeof(int));
for (int k = 0; k < innerVectorSize; k++) {
//Save the type
ParseAction* toSave = (*innerVector)[k];
ParseAction::ActionType actionType = toSave->action;
file.write((char*)&actionType, sizeof(ParseAction::ActionType));
//Save the reduce rule if necessary
if (actionType == ParseAction::REDUCE) {
//Save the reduce rule
ParseRule* rule = toSave->reduceRule;
//int pointer index
int ptrIndx = rule->getIndex();
file.write((char*)&ptrIndx, sizeof(int));
//Symbol leftHandle
Symbol leftHandle = rule->getLeftSide();
//Save the name
std::string symbolName = leftHandle.getName(); //Get the string
size = symbolName.size()+1;
file.write((char*)&size, sizeof(int)); //Save size of string
file.write((char*)(symbolName.c_str()), size); //Save the string
//Save the value
std::string symbolValue = leftHandle.getValue(); //Get the string
size = symbolValue.size()+1;
file.write((char*)&size, sizeof(int)); //Save size of string
file.write((char*)(symbolValue.c_str()), size); //Save the string
bool isTerminal = leftHandle.isTerminal();
file.write((char*)&isTerminal, sizeof(bool)); //Save the true false
//std::vector<Symbol>* lookahead;
//Should not need
//std::vector<Symbol> rightSide;
std::vector<Symbol> rightSide = rule->getRightSide();
size = rightSide.size();
//std::cout << leftHandle.toString() << std::endl;
file.write((char*)&size, sizeof(int));
for (int l = 0; l < rightSide.size(); l++) {
//Save the name
symbolName = rightSide[l].getName(); //Get the string
size = symbolName.size()+1;
file.write((char*)&size, sizeof(int)); //Save size of string
file.write((char*)(symbolName.c_str()), size); //Save the string
//
//Save the value
symbolValue = rightSide[l].getValue(); //Get the string
size = symbolValue.size()+1;
file.write((char*)&size, sizeof(int)); //Save size of string
file.write((char*)(symbolValue.c_str()), size); //Save the string
//
isTerminal = rightSide[l].isTerminal();
file.write((char*)&isTerminal, sizeof(bool)); //Save the true false
}
}
int shiftState = toSave->shiftState;
file.write((char*)&shiftState, sizeof(int));
}
}
}
}
void Table::importTable(char* tableData) {
//Load symbolIndexVec
int size = *((int*)tableData);
tableData += sizeof(int);
for (int i = 0; i < size; i++) {
int stringLen = *((int*)tableData);
tableData += sizeof(int);
std::string symbolName = std::string(tableData);
tableData += stringLen*sizeof(char);
stringLen = *((int*)tableData);
tableData += sizeof(int);
std::string symbolValue = std::string(tableData);
tableData += stringLen*sizeof(char);
bool isTerminal = *((bool*)tableData);
tableData += sizeof(bool);
symbolIndexVec.push_back(Symbol(symbolName, isTerminal, symbolValue));
}
//Now for the actual table
int tableSize = *((int*)tableData);
tableData += sizeof(int);
for (int i = 0; i < tableSize; i++) {
//each item is a middle vector
std::vector< std::vector<ParseAction*>* >* middleVector = new std::vector< std::vector<ParseAction*>* >();
table.push_back(middleVector);
int middleVectorSize = *((int*)tableData);
tableData += sizeof(int);
for (int j = 0; j < middleVectorSize; j++) {
//each item is an inner vector
std::vector<ParseAction*>* innerVector = new std::vector<ParseAction*>();
middleVector->push_back(innerVector);
int innerVectorSize = *((int*)tableData);
tableData += sizeof(int);
for (int k = 0; k < innerVectorSize; k++) {
//each item is a ParseRule
ParseAction::ActionType action = *((ParseAction::ActionType*)tableData);
tableData += sizeof(ParseAction::ActionType);
//If reduce, import the reduce rule
ParseRule* reduceRule = NULL;
if (action == ParseAction::REDUCE) {
int ptrIndx = *((int*)tableData);
tableData += sizeof(int);
size = *((int*)tableData);
tableData += sizeof(int);
std::string leftHandleName = std::string(tableData);
tableData += size*sizeof(char);
size = *((int*)tableData);
tableData += sizeof(int);
std::string leftHandleValue = std::string(tableData);
tableData += size*sizeof(char);
bool isTerminal = *((bool*)tableData);
tableData += sizeof(bool);
//right side
std::vector<Symbol> rightSide;
size = *((int*)tableData);
tableData += sizeof(int);
for (int l = 0; l < size; l++) {
int inStringLen = *((int*)tableData);
tableData += sizeof(int);
std::string inSymbolName = std::string(tableData);
tableData += inStringLen*sizeof(char);
inStringLen = *((int*)tableData);
tableData += sizeof(int);
std::string inSymbolValue = std::string(tableData);
tableData += inStringLen*sizeof(char);
bool inIsTerminal = *((bool*)tableData);
tableData += sizeof(bool);
rightSide.push_back(Symbol(inSymbolName, inIsTerminal, inSymbolValue));
}
reduceRule = new ParseRule(Symbol(leftHandleName, isTerminal, leftHandleValue), ptrIndx, rightSide, std::vector<Symbol>());
}
int shiftState = *((int*)tableData);
tableData += sizeof(int);
//And push the new action back
if (reduceRule)
innerVector->push_back(new ParseAction(action, reduceRule));
else
innerVector->push_back(new ParseAction(action, shiftState));
}
}
}
}
void Table::setSymbols(Symbol EOFSymbol, Symbol nullSymbol) {
this->EOFSymbol = EOFSymbol;
this->nullSymbol = nullSymbol;
}
void Table::add(int stateNum, Symbol tranSymbol, ParseAction* action) {
//If this is the first time we're adding to the table, add the EOF character
if (symbolIndexVec.size() == 0)
symbolIndexVec.push_back(EOFSymbol);
//If state not in table, add up to and it.
//std::cout << "table size is " << table.size() <<std::endl;
while (stateNum >= table.size()) {
//std::cout << "Pushing back table" << std::endl;
table.push_back(new std::vector<std::vector< ParseAction*>* >());
}
//find out what index this symbol is on
int symbolIndex = -1;
for (std::vector<Symbol>::size_type i = 0; i < symbolIndexVec.size(); i++) {
if ( symbolIndexVec[i] == tranSymbol ) {
//Has been found
symbolIndex = i;
break;
}
}
//std::cout << "symbolIndex is " << symbolIndex << std::endl;
//If we've never done this symbol, add it
if (symbolIndex < 0) {
// std::cout << "pushing back symbolIndexVec" <<std::endl;
symbolIndex = symbolIndexVec.size();
symbolIndexVec.push_back(tranSymbol);
}
//std::cout << "symbolIndex is " << symbolIndex << " which is " << symbolIndexVec[symbolIndex]->toString() << std::endl;
//std::cout << table[stateNum] << " ";
while (symbolIndex >= table[stateNum]->size()) {
table[stateNum]->push_back(NULL);
}
//If this table slot is empty
//std::cout << "table[stateNum] is " << table[stateNum] << std::endl;
//std::cout << "blank is " << (*(table[stateNum]))[symbolIndex] << std::endl;
if ( (*(table[stateNum]))[symbolIndex] == NULL ) {
//std::cout << "Null, adding " << action->toString() << std::endl;
std::vector<ParseAction*>* actionList = new std::vector<ParseAction*>();
actionList->push_back(action);
(*(table[stateNum]))[symbolIndex] = actionList;
}
//If the slot is not empty and does not contain ourself, then it is a conflict
//else if ( !(*(table[stateNum]))[symbolIndex]->equalsExceptLookahead(*action)) {
else {
//std::cout << "not Null!" << std::endl;
//std::cout << "State: " << stateNum << " Conflict between old: " << (*(table[stateNum]))[symbolIndex]->toString() << " and new: " << action->toString() << " on " << tranSymbol->toString() << std::endl;
//Check to see if this action is already in the list
//(*(table[stateNum]))[symbolIndex]->push_back(action);
bool alreadyIn = false;
for (std::vector<ParseAction*>::size_type i = 0; i < (*(table[stateNum]))[symbolIndex]->size(); i++)
if (*((*((*(table[stateNum]))[symbolIndex]))[i]) == *action)
alreadyIn = true;
if (!alreadyIn)
(*(table[stateNum]))[symbolIndex]->push_back(action);
}
}
void Table::remove(int stateNum, Symbol tranSymbol) {
//find out what index this symbol is on
int symbolIndex = -1;
for (std::vector<Symbol>::size_type i = 0; i < symbolIndexVec.size(); i++) {
if ( symbolIndexVec[i] == tranSymbol ) {
//Has been found
symbolIndex = i;
break;
}
}
(*(table[stateNum]))[symbolIndex] = NULL;
}
std::vector<ParseAction*>* Table::get(int state, Symbol token) {
int symbolIndex = -1;
for (std::vector<Symbol>::size_type i = 0; i < symbolIndexVec.size(); i++) {
if ( symbolIndexVec[i] == token) {
symbolIndex = i;
break;
}
}
if (symbolIndex == -1) {
std::cout << "Unrecognized symbol: " << token.toString() << ", cannot get from table!" << std::endl;
return NULL;
}
//std::cout << "Get for state: " << state << ", and Symbol: " << token.toString() << std::endl;
if (state < 0 || state >= table.size()) {
std::cout << "State bad: " << state << std::endl;
return NULL;
}
std::vector<ParseAction*>* action = NULL;
if (symbolIndex < 0 || symbolIndex >= table[state]->size()) {
//std::cout << "Symbol bad for this state: " << token.toString() << ". This is a reject." << std::endl;
} else {
action = (*(table[state]))[symbolIndex];
}
//This is the accepting state, as it is state 1's reduction on EOF, and EOF is index 0 in the symbolIndexVec
//(This assumes singular goal assignment, a simplification for now)
if (state == 1 && symbolIndex == 0) {
if (action == NULL)
action = new std::vector<ParseAction*>();
action->push_back(new ParseAction(ParseAction::ACCEPT));
}
//If outside the symbol range of this state (same as NULL), reject
if ( symbolIndex >= table[state]->size() ) {
action = new std::vector<ParseAction*>();
action->push_back(new ParseAction(ParseAction::REJECT));
}
//If null, reject. (this is a space with no other action)
if (action == NULL) {
action = new std::vector<ParseAction*>();
action->push_back(new ParseAction(ParseAction::REJECT));
}
//Otherwise, we have something, so return it
return action;
}
ParseAction* Table::getShift(int state, Symbol token) {
std::vector<ParseAction*>* actions = get(state, token);
ParseAction* shift = NULL;
for (int i = 0; i < actions->size(); i++) {
if ((*actions)[i]->action == ParseAction::SHIFT) {
shift = (*actions)[i];
break;
}
}
return shift;
}
std::vector<std::pair<std::string, ParseAction>> Table::stateAsParseActionVector(int state) {
std::vector<std::pair<std::string, ParseAction>> reconstructedState;
std::vector<std::vector<ParseAction*>*>* stateVec = table[state];
for (int i = 0; i < stateVec->size(); i++)
if (std::vector<ParseAction*>* forStateAndSymbol = (*stateVec)[i])
for (int j = 0; j < forStateAndSymbol->size(); j++)
reconstructedState.push_back(std::make_pair(symbolIndexVec[i].toString(),*((*forStateAndSymbol)[j])));
return reconstructedState;
}
std::string Table::toString() {
std::string concat = "";
for (std::vector<Symbol>::size_type i = 0; i < symbolIndexVec.size(); i++)
concat += "\t" + symbolIndexVec[i].toString();
concat += "\n";
for (std::vector< std::vector< std::vector< ParseRule* >* >* >::size_type i = 0; i < table.size(); i++) {
concat += intToString(i) + " is the state\t";
for (std::vector< std::vector< ParseRule* >* >::size_type j = 0; j < table[i]->size(); j++) {
concat += "for " + symbolIndexVec[j].toString() + " do ";
if ( (*(table[i]))[j] != NULL) {
for (std::vector< ParseRule* >::size_type k = 0; k < (*(table[i]))[j]->size(); k++) {
concat += (*((*(table[i]))[j]))[k]->toString() + "\t";
}
} else {
concat += "NULL\t";
}
}
concat += "\n";
}
return(concat);
}

View File

@@ -1,62 +0,0 @@
#include "Tester.h"
Tester::Tester(std::string krakenInvocation, std::string krakenGrammerLocation) : krakenInvocation(krakenInvocation), krakenGrammerLocation(krakenGrammerLocation) {
//initlization list
removeCmd = "rm -r";
resultsExtention = ".results";
expectedExtention = ".expected_results";
krakenExtention = ".krak";
changePermissions = "chmod 755";
shell = "sh";
cd = "cd";
redirect = ">";
sep = "/";
}
Tester::~Tester() {
//Nothing
}
void Tester::cleanExtras(std::string fileName) {
ssystem(removeCmd + " " + fileName);
}
bool Tester::run(std::string path) {
std::string fileName = split(path, *sep.c_str()).back();
std::cout << "Testing: " << fileName << " with " << krakenInvocation << " and " << krakenGrammerLocation << std::endl;
cleanExtras(path);
ssystem(krakenInvocation + " " + path + krakenExtention + " " + path);
// done automatically now
//ssystem(changePermissions + " " + path + sep + fileName + ".sh");
//ssystem(cd + " " + path + "; " + "./" + fileName + ".sh");
//ssystem(changePermissions + " " + path + sep + fileName);
ssystem(path + sep + fileName + " " + redirect + " " + path + sep + fileName + resultsExtention);
bool result = compareFiles(fileName + expectedExtention, path + sep + fileName + resultsExtention);
//If the test was successful, we don't need all the extra files
if (result)
cleanExtras(path);
return result;
}
bool Tester::compareFiles(std::string file1Path, std::string file2Path) {
std::ifstream file1, file2;
file1.open(file1Path);
if (!file1.is_open()) {
std::cout << file1Path << " could not be opened!" << std::endl;
return false;
}
file2.open(file2Path);
if (!file2.is_open()) {
std::cout << file2Path << " could not be opened!" << std::endl;
return false;
}
std::string file1contents = readFile(file1);
std::string file2contents = readFile(file2);
return file1contents.compare(file2contents) == 0;
}

View File

@@ -1,268 +0,0 @@
#include "Type.h"
Type::Type() {
indirection = 0;
baseType = none;
typeDefinition = nullptr;
templateDefinition = nullptr;
returnType = nullptr;
templateInstantiated = false;
is_reference = false;
}
Type::Type(ValueType typeIn, int indirectionIn) {
indirection = indirectionIn;
baseType = typeIn;
typeDefinition = nullptr;
templateDefinition = nullptr;
returnType = nullptr;
templateInstantiated = false;
is_reference = false;
}
Type::Type(ValueType typeIn, std::set<std::string> traitsIn) {
indirection = 0;
baseType = typeIn;
traits = traitsIn;
typeDefinition = nullptr;
templateDefinition = nullptr;
returnType = nullptr;
templateInstantiated = false;
is_reference = false;
}
Type::Type(NodeTree<ASTData>* typeDefinitionIn, int indirectionIn) {
indirection = indirectionIn;
baseType = none;
typeDefinition = typeDefinitionIn;
templateDefinition = nullptr;
returnType = nullptr;
templateInstantiated = false;
is_reference = false;
}
Type::Type(NodeTree<ASTData>* typeDefinitionIn, std::set<std::string> traitsIn) {
indirection = 0;
baseType = none;
typeDefinition = typeDefinitionIn;
traits = traitsIn;
templateDefinition = nullptr;
returnType = nullptr;
templateInstantiated = false;
is_reference = false;
}
Type::Type(ValueType typeIn, NodeTree<ASTData>* typeDefinitionIn, int indirectionIn, bool referenceIn, std::set<std::string> traitsIn) {
baseType = typeIn;
indirection = indirectionIn;
typeDefinition = typeDefinitionIn;
traits = traitsIn;
templateDefinition = nullptr;
returnType = nullptr;
templateInstantiated = false;
is_reference = referenceIn;
}
Type::Type(ValueType typeIn, NodeTree<ASTData>* typeDefinitionIn, int indirectionIn, bool referenceIn, std::set<std::string> traitsIn, std::vector<Type*> parameterTypesIn, Type* returnTypeIn) {
baseType = typeIn;
indirection = indirectionIn;
typeDefinition = typeDefinitionIn;
traits = traitsIn;
templateDefinition = nullptr;
parameterTypes = parameterTypesIn;
returnType = returnTypeIn;
templateInstantiated = false;
is_reference = referenceIn;
}
Type::Type(std::vector<Type*> parameterTypesIn, Type* returnTypeIn, bool referenceIn) {
baseType = function_type;
indirection = 0;
typeDefinition = nullptr;
templateDefinition = nullptr;
parameterTypes = parameterTypesIn;
returnType = returnTypeIn;
templateInstantiated = false;
is_reference = referenceIn;
}
Type::Type(ValueType typeIn, NodeTree<Symbol>* templateDefinitionIn, std::set<std::string> traitsIn) {
indirection = 0;
baseType = typeIn;
typeDefinition = nullptr;
templateDefinition = templateDefinitionIn;
traits = traitsIn;
returnType = nullptr;
templateInstantiated = false;
is_reference = false;
}
Type::~Type() {
}
const bool Type::operator==(const Type &other) const {
return test_equality(other, true);
}
const bool Type::test_equality(const Type &other, bool care_about_references) const {
bool first_part = ( baseType == other.baseType && indirection == other.indirection && typeDefinition == other.typeDefinition && templateDefinition == other.templateDefinition && other.traits == traits);
if (care_about_references && is_reference != other.is_reference)
return false;
if (!first_part)
return false;
if ((returnType && !other.returnType) || (!returnType && other.returnType))
return false;
if (returnType && other.returnType)
if (*returnType != *other.returnType)
return false;
if (parameterTypes.size() != other.parameterTypes.size())
return false;
for (int i = 0; i < parameterTypes.size(); i++)
if (*parameterTypes[i] != *other.parameterTypes[i])
return false;
return true;
}
const bool Type::operator!=(const Type &other) const {
return(!this->operator==(other));
}
const bool Type::operator<(const Type &other) const {
if (baseType != other.baseType)
return baseType < other.baseType;
if (indirection != other.indirection)
return indirection < other.indirection;
if (is_reference != other.is_reference)
return is_reference;
if (typeDefinition != other.typeDefinition)
return typeDefinition < other.typeDefinition;
if (templateDefinition != other.templateDefinition)
return templateDefinition < other.templateDefinition;
if (traits != other.traits)
return traits < other.traits;
if ((returnType && !other.returnType) || (!returnType && other.returnType))
return returnType < other.returnType;
if (returnType && other.returnType)
if (*returnType != *other.returnType)
return *returnType < *other.returnType;
if (parameterTypes.size() != other.parameterTypes.size())
return parameterTypes.size() < other.parameterTypes.size();
for (int i = 0; i < parameterTypes.size(); i++)
if (*parameterTypes[i] != *other.parameterTypes[i])
return *parameterTypes[i] < *other.parameterTypes[i];
return false;
}
std::string Type::toString(bool showTraits) {
std::string typeString;
switch (baseType) {
case none:
if (typeDefinition)
typeString = typeDefinition->getDataRef()->symbol.getName();
else
typeString = "none";
break;
case template_type:
typeString = "template: " + templateDefinition->getDataRef()->toString();
break;
case template_type_type:
typeString = "template_type_type";
break;
case void_type:
typeString = "void";
break;
case boolean:
typeString = "bool";
break;
case integer:
typeString = "int";
break;
case floating:
typeString = "float";
break;
case double_percision:
typeString = "double";
break;
case character:
typeString = "char";
break;
case function_type:
typeString = "function(";
for (Type *param : parameterTypes)
typeString += param->toString();
typeString += "): " + returnType->toString();
break;
default:
if (typeDefinition)
typeString = typeDefinition->getDataRef()->symbol.getName();
else
typeString = "unknown_type";
}
if (is_reference)
typeString = "ref " + typeString;
for (int i = 0; i < indirection; i++)
typeString += "*";
if (indirection < 0)
typeString += "negative indirection: " + intToString(indirection);
if (traits.size() && showTraits) {
typeString += "[ ";
for (auto i : traits)
typeString += i + " ";
typeString += "]";
}
//std::cout << "Extra components of " << typeString << " are " << indirection << " " << typeDefinition << " " << templateDefinition << std::endl;
return typeString;
}
Type* Type::clone() {
return new Type(baseType, typeDefinition, indirection, is_reference, traits, parameterTypes, returnType);
}
int Type::getIndirection() {
return indirection;
}
void Type::setIndirection(int indirectionIn) {
indirection = indirectionIn;
}
void Type::increaseIndirection() {
setIndirection(indirection+1);
}
void Type::decreaseIndirection() {
setIndirection(indirection-1);
}
void Type::modifyIndirection(int mod) {
setIndirection(indirection + mod);
}
Type Type::withIncreasedIndirection() {
Type *newOne = clone();
newOne->increaseIndirection();
return *newOne;
}
Type Type::withReference() {
Type *newOne = clone();
newOne->is_reference = true;
return *newOne;
}
Type *Type::withReferencePtr() {
Type *newOne = clone();
newOne->is_reference = true;
return newOne;
}
Type *Type::withIncreasedIndirectionPtr() {
Type *newOne = clone();
newOne->increaseIndirection();
return newOne;
}
Type Type::withDecreasedIndirection() {
Type *newOne = clone();
newOne->decreaseIndirection();
return *newOne;
}
Type* Type::withoutReference() {
Type *newOne = clone();
newOne->is_reference = false;
return newOne;
}

View File

@@ -1,92 +0,0 @@
#include "util.h"
int ssystem(std::string command) {
return system(command.c_str());
}
std::string intToString(int theInt) {
std::stringstream converter;
converter << theInt;
return converter.str();
}
std::string replaceExEscape(std::string first, std::string search, std::string replace) {
size_t pos = 0;
while (pos <= first.size()-search.size()) {
pos = first.find(search, pos);
if (pos == std::string::npos)
break;
//std::cout << "Position is " << pos << " size of first is " << first.size() << " size of replace is " << replace.size() << std::endl;
//If escaped, don't worry about it.
if (pos > 0) {
int numBackslashes = 0;
int countBack = 1;
while ((int)pos-countBack >= 0 && first[pos-countBack] == '\\') {
numBackslashes++;
countBack++;
}
if (numBackslashes % 2 == 1) {
pos++;
continue;
}
}
first = first.replace(pos, search.size(), replace);
pos += replace.size();
}
return first;
}
//String slicing is crazy useful. substr isn't bad, but slicing with negative indices is wonderful
std::string strSlice(std::string str, int begin, int end) {
if (begin < 0)
begin += str.length()+1;
if (end < 0)
end += str.length()+1;
return str.substr(begin, end-begin);
}
int findPerenEnd(std::string str, int i) {
int numHangingOpen = 0;
for (; i< str.length(); i++) {
if (str[i] == '(')
numHangingOpen++;
else if (str[i] == ')')
numHangingOpen--;
if (numHangingOpen == 0)
return i;
}
return -1;
}
std::vector<std::string> split(const std::string &str, char delim) {
std::stringstream ss(str);
std::string word;
std::vector<std::string> splitVec;
while (std::getline(ss, word, delim))
splitVec.push_back(word);
return splitVec;
}
std::string join(const std::vector<std::string> &strVec, std::string joinStr) {
if (strVec.size() == 0)
return "";
std::string joinedStr = strVec[0];
for (int i = 1; i < strVec.size(); i++)
joinedStr += joinStr + strVec[i];
return joinedStr;
}
std::string readFile(std::istream &file) {
std::string line, contents;
while(file.good()) {
getline(file, line);
contents.append(line+"\n");
}
return contents;
}
std::string padWithSpaces(std::string str, int padTo) {
while(str.length() < padTo)
str += " ";
return str;
}

9
doc/.gitignore vendored Normal file
View File

@@ -0,0 +1,9 @@
*.swp
*.zip
*.aux
*.bbl
*.blg
*.log
*.out
*.pdf

View File

@@ -1,345 +0,0 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Kraken Documentation
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%----------------------------------------------------------------------------------------
% PACKAGES AND DOCUMENT CONFIGURATIONS
%----------------------------------------------------------------------------------------
\documentclass{article}
\usepackage{graphicx} % Required for the inclusion of images
\usepackage{amsmath} % Required for some math elements
\renewcommand{\labelenumi}{\alph{enumi}.} % Make numbering in the enumerate environment by letter rather than number (e.g. section 6)
\usepackage{times}
\usepackage{listings}
\usepackage{color}
%----------------------------------------------------------------------------------------
% DOCUMENT INFORMATION
%----------------------------------------------------------------------------------------
\title{Kraken Programming Guide v0.0} % Title
\author{Jack \textsc{Sparrow}} % Author name
\date{\today} % Date for the report
\begin{document}
\maketitle % Insert the title, author and date
%----------------------------------------------------------------------------------------
% SECTION Compiling
%----------------------------------------------------------------------------------------
\section{Compiling}
The Kraken compiler currently must be built from source.
You can clone the repository from a terminal using:
\begin{lstlisting}
git clone https://github.com/Limvot/kraken.git
\end{lstlisting}
Once you have the repository, run the following commands:
\begin{lstlisting}
mkdir build %Create a build directory
cd build
cmake .. %Requires cmake to build the compiler
make %Create the compiler executable
\end{lstlisting}
This will create a kraken executable, which is how we invoke the compiler.
Kraken supports several ways of invoking the compiler. These include:
\begin{lstlisting}
kraken source.krak
kraken source.krak outputExe
kraken grammarFile.kgm source.krak outputExe
\end{lstlisting}
The grammar file is specific to the compiler and is included
in the GitHub repository. When you run the compile command, a new directory
with the name of the outputExe you specified will be created. In this directory
is a shell script, which will compile the generated C file into a binary executable.
This binary executable can then be run like any normal C executable.
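As a rough sketch only (the directory and script names below are assumed from the
output name outputExe, and newer versions of the compiler may run the generated
script for you automatically), a complete build-and-run cycle might look like:
\begin{lstlisting}
kraken grammarFile.kgm source.krak outputExe
cd outputExe      %directory created by the compiler
sh outputExe.sh   %assumed name of the generated build script
./outputExe       %run the resulting binary
\end{lstlisting}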
%----------------------------------------------------------------------------------------
% SECTION Variables
%----------------------------------------------------------------------------------------
\section{Variables}
\label{sec:var}
Kraken has automatic type deduction. This is sort of like the duck typing of
Python. The difference is that variables cannot change types. In this way, it
is much more like an implicit ``auto'' keyword in C++. Unlike C++, semicolons are
optional after declarations.
\subsection{Variable Declaration}
\begin{lstlisting}[language=C++]
var A: int; //A is an uninitialized int
var B = 1; //B is integer
var C = 2.0; //C is double
var D: double = 3.14 //D is double
\end{lstlisting}
\subsection{Primitive Types}
The primitive types found in kraken are:
\begin{enumerate}
\item int
\item float
\item double
\item char
\item bool
\item void
\end{enumerate}
%----------------------------------------------------------------------------------------
% SECTION 2: Functions
%----------------------------------------------------------------------------------------
\section{Functions}
\begin{lstlisting}[language=C++]
fun FunctionName(arg1 : arg1_type, arg2 : arg2_type) : returnType {
var result = arg1 + arg2;
return result;
}
\end{lstlisting}
Functions are declared using the {\bf{fun}} keyword. Arguments declared as
shown are passed by value, not by reference, so any changes made to them inside
the function are not visible to the caller, as the short example below
illustrates.
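(The helper addOne here is a made-up example, not part of any Kraken library.)
\begin{lstlisting}[language=C++]
//addOne receives its own copy of x
fun addOne(x: int): int {
x = x + 1; //modifies only the local copy
return x;
}
var a = 5;
var b = addOne(a);
//a is still 5 here, while b is 6
\end{lstlisting}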
%----------------------------------------------------------------------------------------
% SECTION I/O
%----------------------------------------------------------------------------------------
\section{Input and Output}
In order to print to a terminal or file, the {\bf{io}} library must be
imported. There are a few different functions you can use to print to the
terminal.
The print() function prints to the terminal without a newline character. Like
Java, there is a println() function that prints whatever you pass in followed
by a newline. There are also functions that switch to colored output in a Unix
terminal; the color stays in effect for everything printed until you call
Reset().
\begin{enumerate}
\item {\color{red}{BoldRed()}}
\item {\color{green}{BoldGreen()}}
\item {\color{yellow}{BoldYellow()}}
\item {\color{blue}{BoldBlue()}}
\item {\color{magenta}{BoldMagenta()}}
\item {\color{cyan}{BoldCyan()}}
\end{enumerate}
\begin{lstlisting}[language=C++]
io::print(3.2); //print without a newline character
io::println(varA); //print variable A with a newline character
io::BoldRed();
io::println("This line is printed Red");
io::Reset();
io::println("This line is printed black");
\end{lstlisting}
You can also use kraken to read and write to files. The functions are as
follows:
\begin{lstlisting}[language=C++]
//returns true if file exists
var ifExists = io::file_exists("/usr/bin/clang");
//read file into string
var fileString = io::read_file("~/Documents/file.txt");
//write a string to the file
io::write_file("/",SteamString);
//read file into vector of chars
var charVec = io::read_file_binary("~/Documents/file2.txt");
//write a vector of chars to a file
io::write_file_binary("/",md5checkSum);
\end{lstlisting}
%----------------------------------------------------------------------------------------
% SECTION Memory Management
%----------------------------------------------------------------------------------------
\section{Memory Management}
\subsection{Pointers}
Pointers in kraken work like they do in C. The {\bf{*}} symbol is the
dereference operator: applied to a pointer, it gives the variable pointed to.
For instance:
\begin{lstlisting}[language=C++]
var B: *int = 4; //B is a pointer to an integer 4
*B = 3; //the int that B points to is now 3
print(B); //would print an address, like "0xFFA3"
\end{lstlisting}
\subsection{References}
References are a way to create "automatic" pointers. If a function
takes in a reference, the variable is passed by reference, instead of by
value. This means that no copy of the variable is made, and any changes
made to the variable in the function will remain after the end of the
function. References also allow left-hand-side assignment: an expression that
returns a reference (such as an indexed array or a function call) can appear on
the left of an equals sign and have its value changed.
\begin{lstlisting}[language=C++]
fun RefFunction(arg1: ref int): ref int {
return arg1;
}
var a = 6;
var b = RefFunction(a);
println(a); //a is now equal to 6
println(b); //b is now equal to 6
RefFunction(b) = 15;
println(b); //b is now equal to 15
\end{lstlisting}
\subsection{Dynamic Memory Allocation}
In order to allocate memory on the heap instead of the stack, dynamic
memory allocation must be used. The data must be explicitly allocated with
the {\bf{new}} keyword, and deleted with the {\bf{delete}} keyword. The
size in both instances must be provided.
\begin{lstlisting}[language=C++]
var data = new<int>(8); //Allocate 8 integers on the heap
delete(data,8); //Free the memory when it's no longer used.
\end{lstlisting}
%----------------------------------------------------------------------------------------
% SECTION Classes
%----------------------------------------------------------------------------------------
\section{Classes}
\subsection{Constructors}
As with most of kraken, classes are based on their C++ counterparts, with
a few key differences. Constructors in kraken are not called by default.
You must actually call the constructor function. The constructor must return
a pointer to the object, which is denoted by the {\bf{this}} keyword.
The destructor is automatically called when the object goes out of scope,
and is just called destruct(). An example class is shown below:
\begin{lstlisting}[language=C++]
obj MyObject (Object) {
var variable1: int;
var variable2: vector::vector<double>;
fun construct(): *MyObject {
variable1 = 42;
variable2.construct();
return this;
}
//Could also pass by reference???
fun copy_construct(old: *MyObject): void {
variable1 = old->variable1;
variable2.copy_construct(&old->variable2);
}
fun destruct() {
variable2.destruct();
}
}
\end{lstlisting}
\subsection{Operator Overloading}
Operator overloading allows you to use operators as syntactic sugar and make
your code read more naturally. This again borrows mostly from C++, and you can
overload most of the operators that C++ allows. An example is shown
below:
\begin{lstlisting}
//Inside a class
//overload the assignment operator
fun operator=(other: SampleObject): void{
destruct();
copy_construct(&other);
}
//overload the equality operator
fun operator==(other: SampleObject): bool{
return EqualTest == other.EqualTest;
}
\end{lstlisting}
\subsection{Traits}
Currently, Kraken has no notion of inheritance. Instead, objects can be
initialized with traits. These give special properties to the object. For
instance, if the object is defined with the {\bf{Object}} trait, then its
destructor will be called when the object goes out of scope. The second trait
that kraken has is the {\bf{Serializable}} trait. This allows it to be used
with the {\bf{serialize}} class, which serializes it into a vector of bytes.
This stream of bytes could then be used to send messages over TCP, etc.
\begin{lstlisting}
//Object has both Object and Serializable traits
obj Hermes (Object, Serializable) {
var RedBull: vector::vector<string>;
fun construct(): *Hermes {
RedBull = "gives you wings";
return this;
}
fun serialize(): vector::vector<char> {
//vector already has a serialize member function
var toReturn = RedBull.serialize();
return toReturn;
}
fun unserialize(it: ref vector::vector<char>, pos: int): int {
pos = RedBull.unserialize(it,pos);
return pos;
}
fun destruct(): void {
io::println("I must return to my people");
}
}
\end{lstlisting}
%----------------------------------------------------------------------------------------
% SECTION Templates
%----------------------------------------------------------------------------------------
\section{Templates}
Templates are a very important part of C++, but are also one of the reasons
people do not like the language. They are extremely useful, but also fairly
hard to use properly. If you make an error while using templates, the bug is
often extremely difficult to find. Kraken templates aim to prevent that problem.
\\
Templates are a way of writing something once for any type. At compile time,
the compiler will see what types you are using with the template, and substitute
those types in. This is not duck typing, since the types cannot change once they
are assigned. It is more like how you can initialize variables in kraken, with
the use of {\bf{var}}. This is extremely useful for something like a container.
The vector class in kraken uses templates, so you can put any type, including
custom objects, into a vector. \\
The convention is to use {\bf{T}} for a template parameter and, if there is a
second one, {\bf{U}}. The following example, taken from the vector
implementation, demonstrates templates.
\begin{lstlisting}[language=C++]
//Can have a vector of any type, with <T>
obj vector<T> (Object, Serializable) {
//data can be an array of any type
var data: *T;
//size and available are just primitive ints
var size: int;
var available: int;
...
}
\end{lstlisting}
%----------------------------------------------------------------------------------------
% SECTION Standard Library
%----------------------------------------------------------------------------------------
\section{Standard Library}
\subsection{Import Statements}
\subsection{Vector}
\subsection{String}
\subsection{Regex}
\subsection{Util}
\subsection{Data Structures}
\subsubsection{Stack}
\subsubsection{Queue}
\subsubsection{Set}
\subsubsection{Map}
%----------------------------------------------------------------------------------------
% SECTION Understanding Kraken Errors
%----------------------------------------------------------------------------------------
\section{Understanding Kraken Errors}
Section error
%----------------------------------------------------------------------------------------
% SECTION C Passthrough
%----------------------------------------------------------------------------------------
\section{C Passthrough}
\end{document}

11
doc/cited-paper.bib Normal file
View File

@@ -0,0 +1,11 @@
@phdthesis{shutt2010fexprs,
title={Fexprs as the basis of Lisp function application or \$vau: the ultimate abstraction},
author={Shutt, John N},
year={2010}
}
@article{kearsleyimplementing,
title={Implementing a Vau-based Language With Multiple Evaluation Strategies},
author={Kearsley, Logan}
}

2
doc/make_paper.sh Executable file
View File

@@ -0,0 +1,2 @@
#!/usr/bin/env bash
touch writeup.pdf && rm writeup.aux writeup.bbl writeup.blg writeup.log writeup.out writeup.pdf && pdflatex writeup && bibtex writeup && pdflatex writeup && bibtex writeup && pdflatex writeup && bibtex writeup && evince writeup.pdf

38
doc/nix/sources.json Normal file
View File

@@ -0,0 +1,38 @@
{
"niv": {
"branch": "master",
"description": "Easy dependency management for Nix projects",
"homepage": "https://github.com/nmattia/niv",
"owner": "nmattia",
"repo": "niv",
"rev": "65a61b147f307d24bfd0a5cd56ce7d7b7cc61d2e",
"sha256": "17mirpsx5wyw262fpsd6n6m47jcgw8k2bwcp1iwdnrlzy4dhcgqh",
"type": "tarball",
"url": "https://github.com/nmattia/niv/archive/65a61b147f307d24bfd0a5cd56ce7d7b7cc61d2e.tar.gz",
"url_template": "https://github.com/<owner>/<repo>/archive/<rev>.tar.gz"
},
"nixpkgs": {
"branch": "master",
"description": "Nix Packages collection",
"homepage": "",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "2b65a74aba274a06a673dfb6f28b96cbe0b032fb",
"sha256": "0f62z6q00dpxnf4c5ip8362kzzcmnlhx6fbia6dr97a21fzbc8aq",
"type": "tarball",
"url": "https://github.com/NixOS/nixpkgs/archive/2b65a74aba274a06a673dfb6f28b96cbe0b032fb.tar.gz",
"url_template": "https://github.com/<owner>/<repo>/archive/<rev>.tar.gz"
},
"nixpkgs-mozilla": {
"branch": "master",
"description": "mozilla related nixpkgs (extends nixos/nixpkgs repo)",
"homepage": "",
"owner": "mozilla",
"repo": "nixpkgs-mozilla",
"rev": "0510159186dd2ef46e5464484fbdf119393afa58",
"sha256": "1c6r5ldkh71v6acsfhni7f9sxvi7xrqzshcwd8w0hl2rrqyzi58w",
"type": "tarball",
"url": "https://github.com/mozilla/nixpkgs-mozilla/archive/0510159186dd2ef46e5464484fbdf119393afa58.tar.gz",
"url_template": "https://github.com/<owner>/<repo>/archive/<rev>.tar.gz"
}
}

174
doc/nix/sources.nix Normal file
View File

@@ -0,0 +1,174 @@
# This file has been generated by Niv.
let
#
# The fetchers. fetch_<type> fetches specs of type <type>.
#
fetch_file = pkgs: name: spec:
let
name' = sanitizeName name + "-src";
in
if spec.builtin or true then
builtins_fetchurl { inherit (spec) url sha256; name = name'; }
else
pkgs.fetchurl { inherit (spec) url sha256; name = name'; };
fetch_tarball = pkgs: name: spec:
let
name' = sanitizeName name + "-src";
in
if spec.builtin or true then
builtins_fetchTarball { name = name'; inherit (spec) url sha256; }
else
pkgs.fetchzip { name = name'; inherit (spec) url sha256; };
fetch_git = name: spec:
let
ref =
if spec ? ref then spec.ref else
if spec ? branch then "refs/heads/${spec.branch}" else
if spec ? tag then "refs/tags/${spec.tag}" else
abort "In git source '${name}': Please specify `ref`, `tag` or `branch`!";
in
builtins.fetchGit { url = spec.repo; inherit (spec) rev; inherit ref; };
fetch_local = spec: spec.path;
fetch_builtin-tarball = name: throw
''[${name}] The niv type "builtin-tarball" is deprecated. You should instead use `builtin = true`.
$ niv modify ${name} -a type=tarball -a builtin=true'';
fetch_builtin-url = name: throw
''[${name}] The niv type "builtin-url" will soon be deprecated. You should instead use `builtin = true`.
$ niv modify ${name} -a type=file -a builtin=true'';
#
# Various helpers
#
# https://github.com/NixOS/nixpkgs/pull/83241/files#diff-c6f540a4f3bfa4b0e8b6bafd4cd54e8bR695
sanitizeName = name:
(
concatMapStrings (s: if builtins.isList s then "-" else s)
(
builtins.split "[^[:alnum:]+._?=-]+"
((x: builtins.elemAt (builtins.match "\\.*(.*)" x) 0) name)
)
);
# The set of packages used when specs are fetched using non-builtins.
mkPkgs = sources: system:
let
sourcesNixpkgs =
import (builtins_fetchTarball { inherit (sources.nixpkgs) url sha256; }) { inherit system; };
hasNixpkgsPath = builtins.any (x: x.prefix == "nixpkgs") builtins.nixPath;
hasThisAsNixpkgsPath = <nixpkgs> == ./.;
in
if builtins.hasAttr "nixpkgs" sources
then sourcesNixpkgs
else if hasNixpkgsPath && ! hasThisAsNixpkgsPath then
import <nixpkgs> {}
else
abort
''
Please specify either <nixpkgs> (through -I or NIX_PATH=nixpkgs=...) or
add a package called "nixpkgs" to your sources.json.
'';
# The actual fetching function.
fetch = pkgs: name: spec:
if ! builtins.hasAttr "type" spec then
abort "ERROR: niv spec ${name} does not have a 'type' attribute"
else if spec.type == "file" then fetch_file pkgs name spec
else if spec.type == "tarball" then fetch_tarball pkgs name spec
else if spec.type == "git" then fetch_git name spec
else if spec.type == "local" then fetch_local spec
else if spec.type == "builtin-tarball" then fetch_builtin-tarball name
else if spec.type == "builtin-url" then fetch_builtin-url name
else
abort "ERROR: niv spec ${name} has unknown type ${builtins.toJSON spec.type}";
# If the environment variable NIV_OVERRIDE_${name} is set, then use
# the path directly as opposed to the fetched source.
replace = name: drv:
let
saneName = stringAsChars (c: if isNull (builtins.match "[a-zA-Z0-9]" c) then "_" else c) name;
ersatz = builtins.getEnv "NIV_OVERRIDE_${saneName}";
in
if ersatz == "" then drv else
# this turns the string into an actual Nix path (for both absolute and
# relative paths)
if builtins.substring 0 1 ersatz == "/" then /. + ersatz else /. + builtins.getEnv "PWD" + "/${ersatz}";
# Ports of functions for older nix versions
# a Nix version of mapAttrs if the built-in doesn't exist
mapAttrs = builtins.mapAttrs or (
f: set: with builtins;
listToAttrs (map (attr: { name = attr; value = f attr set.${attr}; }) (attrNames set))
);
# https://github.com/NixOS/nixpkgs/blob/0258808f5744ca980b9a1f24fe0b1e6f0fecee9c/lib/lists.nix#L295
range = first: last: if first > last then [] else builtins.genList (n: first + n) (last - first + 1);
# https://github.com/NixOS/nixpkgs/blob/0258808f5744ca980b9a1f24fe0b1e6f0fecee9c/lib/strings.nix#L257
stringToCharacters = s: map (p: builtins.substring p 1 s) (range 0 (builtins.stringLength s - 1));
# https://github.com/NixOS/nixpkgs/blob/0258808f5744ca980b9a1f24fe0b1e6f0fecee9c/lib/strings.nix#L269
stringAsChars = f: s: concatStrings (map f (stringToCharacters s));
concatMapStrings = f: list: concatStrings (map f list);
concatStrings = builtins.concatStringsSep "";
# https://github.com/NixOS/nixpkgs/blob/8a9f58a375c401b96da862d969f66429def1d118/lib/attrsets.nix#L331
optionalAttrs = cond: as: if cond then as else {};
# fetchTarball version that is compatible between all the versions of Nix
builtins_fetchTarball = { url, name ? null, sha256 }@attrs:
let
inherit (builtins) lessThan nixVersion fetchTarball;
in
if lessThan nixVersion "1.12" then
fetchTarball ({ inherit url; } // (optionalAttrs (!isNull name) { inherit name; }))
else
fetchTarball attrs;
# fetchurl version that is compatible between all the versions of Nix
builtins_fetchurl = { url, name ? null, sha256 }@attrs:
let
inherit (builtins) lessThan nixVersion fetchurl;
in
if lessThan nixVersion "1.12" then
fetchurl ({ inherit url; } // (optionalAttrs (!isNull name) { inherit name; }))
else
fetchurl attrs;
# Create the final "sources" from the config
mkSources = config:
mapAttrs (
name: spec:
if builtins.hasAttr "outPath" spec
then abort
"The values in sources.json should not have an 'outPath' attribute"
else
spec // { outPath = replace name (fetch config.pkgs name spec); }
) config.sources;
# The "config" used by the fetchers
mkConfig =
{ sourcesFile ? if builtins.pathExists ./sources.json then ./sources.json else null
, sources ? if isNull sourcesFile then {} else builtins.fromJSON (builtins.readFile sourcesFile)
, system ? builtins.currentSystem
, pkgs ? mkPkgs sources system
}: rec {
# The sources, i.e. the attribute set of spec name to spec
inherit sources;
# The "pkgs" (evaluated nixpkgs) to use for e.g. non-builtin fetchers
inherit pkgs;
};
in
mkSources (mkConfig {}) // { __functor = _: settings: mkSources (mkConfig settings); }

11
doc/shell.nix Normal file
View File

@@ -0,0 +1,11 @@
let
sources = import ./nix/sources.nix;
pkgs = import sources.nixpkgs { };
in
pkgs.mkShell {
buildInputs = with pkgs; [
texlive.combined.scheme-full
evince
];
}

269
doc/writeup.tex Normal file
View File

@@ -0,0 +1,269 @@
%%
%% This is file `sample-acmsmall.tex',
%% generated with the docstrip utility.
%%
%% The original source files were:
%%
%% samples.dtx (with options: `acmsmall')
%%
%% IMPORTANT NOTICE:
%%
%% For the copyright see the source file.
%%
%% Any modified versions of this file must be renamed
%% with new filenames distinct from sample-acmsmall.tex.
%%
%% For distribution of the original source see the terms
%% for copying and modification in the file samples.dtx.
%%
%% This generated file may be distributed as long as the
%% original source files, as listed above, are part of the
%% same distribution. (The sources need not necessarily be
%% in the same archive or directory.)
%%
%%
%% Commands for TeXCount
%TC:macro \cite [option:text,text]
%TC:macro \citep [option:text,text]
%TC:macro \citet [option:text,text]
%TC:envir table 0 1
%TC:envir table* 0 1
%TC:envir tabular [ignore] word
%TC:envir displaymath 0 word
%TC:envir math 0 word
%TC:envir comment 0 0
%%
%%
%% The first command in your LaTeX source must be the \documentclass command.
\documentclass[acmsmall]{acmart}
%%
%% \BibTeX command to typeset BibTeX logo in the docs
\AtBeginDocument{%
\providecommand\BibTeX{{%
\normalfont B\kern-0.5em{\scshape i\kern-0.25em b}\kern-0.8em\TeX}}}
%% Rights management information. This information is sent to you
%% when you complete the rights form. These commands have SAMPLE
%% values in them; it is your responsibility as an author to replace
%% the commands and values with those provided to you when you
%% complete the rights form.
\setcopyright{acmcopyright}
\copyrightyear{2022}
\acmYear{2022}
\acmDOI{10.1145/1122445.1122456}
%%
%% These commands are for a JOURNAL article.
\acmJournal{JACM}
\acmVolume{37}
\acmNumber{4}
\acmArticle{111}
\acmMonth{8}
%%
%% Submission ID.
%% Use this when submitting an article to a sponsored event. You'll
%% receive a unique submission ID from the organizers
%% of the event, and this ID should be used as the parameter to this command.
\acmSubmissionID{123-A56-BU3}
%%
%% The majority of ACM publications use numbered citations and
%% references. The command \citestyle{authoryear} switches to the
%% "author year" style.
%%
%% If you are preparing content for an event
%% sponsored by ACM SIGGRAPH, you must use the "author year" style of
%% citations and references.
%% Uncommenting
%% the next command will enable that style.
%%\citestyle{acmauthoryear}
%%
%% end of the preamble, start of the body of the document source.
\begin{document}
%%
%% The "title" command has an optional parameter,
%% allowing the author to define a "short title" to be used in page headers.
\title{Efficient compilation of a functional Lisp based on Vau calculus}
%%
%% The "author" command and its associated commands are used to define
%% the authors and their affiliations.
%% Of note is the shared affiliation of the first two authors, and the
%% "authornote" and "authornotemark" commands
%% used to denote shared contribution to the research.
\author{Nathan Braswell}
\email{nathan.braswell@gtri.@gatech.edu}
%%\orcid{1234-5678-9012}
%%\author{G.K.M. Tobin}
%%\authornotemark[1]
%%\email{webmaster@marysville-ohio.com}
\affiliation{%
\institution{Georgia Tech}
%%\streetaddress{P.O. Box 1212}
\city{Atlanta}
\state{GA}
\country{USA}
%%\postcode{43017-6221}
}
%%\author{Lars Th{\o}rv{\"a}ld}
%%\affiliation{%
%% \institution{The Th{\o}rv{\"a}ld Group}
%% \streetaddress{1 Th{\o}rv{\"a}ld Circle}
%% \city{Hekla}
%% \country{Iceland}}
%%\email{larst@affiliation.org}
%%\author{Valerie B\'eranger}
%%\affiliation{%
%% \institution{Inria Paris-Rocquencourt}
%% \city{Rocquencourt}
%% \country{France}
%%}
%%\author{Aparna Patel}
%%\affiliation{%
%% \institution{Rajiv Gandhi University}
%% \streetaddress{Rono-Hills}
%% \city{Doimukh}
%% \state{Arunachal Pradesh}
%% \country{India}}
%%\author{Huifen Chan}
%%\affiliation{%
%% \institution{Tsinghua University}
%% \streetaddress{30 Shuangqing Rd}
%% \city{Haidian Qu}
%% \state{Beijing Shi}
%% \country{China}}
%%\author{Charles Palmer}
%%\affiliation{%
%% \institution{Palmer Research Laboratories}
%% \streetaddress{8600 Datapoint Drive}
%% \city{San Antonio}
%% \state{Texas}
%% \country{USA}
%% \postcode{78229}}
%%\email{cpalmer@prl.com}
%%\author{John Smith}
%%\affiliation{%
%% \institution{The Th{\o}rv{\"a}ld Group}
%% \streetaddress{1 Th{\o}rv{\"a}ld Circle}
%% \city{Hekla}
%% \country{Iceland}}
%%\email{jsmith@affiliation.org}
%%\author{Julius P. Kumquat}
%%\affiliation{%
%% \institution{The Kumquat Consortium}
%% \city{New York}
%% \country{USA}}
%%\email{jpkumquat@consortium.net}
%%
%% By default, the full list of authors will be used in the page
%% headers. Often, this list is too long, and will overlap
%% other information printed in the page headers. This command allows
%% the author to define a more concise list
%% of authors' names for this purpose.
%%\renewcommand{\shortauthors}{Trovato and Tobin, et al.}
%%
%% The abstract is a short summary of the work to be presented in the
%% article.
\begin{abstract}
Naively executing a language using Vau and Fexprs instead of macros
is slow.
\end{abstract}
%%
%% The code below is generated by the tool at http://dl.acm.org/ccs.cfm.
%% Please copy and paste the code instead of the example below.
%%
%%\begin{CCSXML}
%%<ccs2012>
%% <concept>
%% <concept_id>10010520.10010553.10010562</concept_id>
%% <concept_desc>Computer systems organization~Embedded systems</concept_desc>
%% <concept_significance>500</concept_significance>
%% </concept>
%% <concept>
%% <concept_id>10010520.10010575.10010755</concept_id>
%% <concept_desc>Computer systems organization~Redundancy</concept_desc>
%% <concept_significance>300</concept_significance>
%% </concept>
%% <concept>
%% <concept_id>10010520.10010553.10010554</concept_id>
%% <concept_desc>Computer systems organization~Robotics</concept_desc>
%% <concept_significance>100</concept_significance>
%% </concept>
%% <concept>
%% <concept_id>10003033.10003083.10003095</concept_id>
%% <concept_desc>Networks~Network reliability</concept_desc>
%% <concept_significance>100</concept_significance>
%% </concept>
%%</ccs2012>
%%\end{CCSXML}
%%\ccsdesc[500]{Computer systems organization~Embedded systems}
%%\ccsdesc[300]{Computer systems organization~Redundancy}
%%\ccsdesc{Computer systems organization~Robotics}
%%\ccsdesc[100]{Networks~Network reliability}
%%
%% Keywords. The author(s) should pick words that accurately describe
%% the work being presented. Separate the keywords with commas.
\keywords{partial evaluation, vau, fexprs, WebAssembly}
%%
%% This command processes the author and affiliation and title
%% information and builds the first part of the formatted document.
%%\maketitle
\section{Introduction and Motivation}
Vau expressions \cite{shutt2010fexprs} (see \url{https://web.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unrestricted/jshutt.pdf}) are the foundation of this work.
All code is available at \url{https://github.com/limvot/kraken}
\section{Prior Work}
\begin{itemize}
\item{} Axis of Eval rundown of attempted implementations - \url{https://axisofeval.blogspot.com/2011/09/kernel-underground.html} \\
\item{} Lambda The Ultimate small discussion of partial eval for Vau/Kernel - \url{http://lambda-the-ultimate.org/node/4346} \\
\item{} Implementing a Vau-based Language With Multiple Evaluation Strategies - \cite{kearsleyimplementing} \\
Talks about how partial evaluation could make this efficient, but does not do it.
\item{} Google Groups email thread by Andres Navarro - \url{https://groups.google.com/g/klisp/c/Dva-Le8Hr-g/m/pyl1Ufu-vksJ} \\
Andres Navarro talks about his experimental fklisp, which is a "very simple functional dialect of Kernel" with no mutation or first-class continuations.
It doesn't compile anything, but prints out the partially evaluated expression. It was a work in progress, ran into performance problems, and seems abandoned.
\end{itemize}
\subsection{Issues}
Slow.
\section{Solution}
Purely functional.
Tricky partial evaluation.
\begin{verbatim}
((wrap (vau (let1)
(let1 lambda (vau se (p b1) (wrap (eval (array vau p b1) se)))
(lambda (n) (* n 2))
)
; impl of let1
)) (vau de (s v b) (eval (array (array vau (array s) b) (eval v de)) de)))
\end{verbatim}
\bibliographystyle{ACM-Reference-Format}
\bibliography{cited-paper}
\end{document}
\endinput
%%
%% End of file `sample-acmsmall.tex'.

17
fib.c
View File

@@ -1,17 +0,0 @@
#include <stdio.h>
int fib(int n) {
if (n == 0) {
return 0;
} else if (n == 1) {
return 1;
} else {
return fib(n-1) + fib(n-2);
}
}
int main(int argc, char** argv) {
int n = 27;
printf("Fib(%d): %d\n", n, fib(n));
return 0;
}

View File

@@ -1,440 +0,0 @@
import vec:*
import vec_literals:*
import map:*
import set:*
import hash_set
import util:*
import str:*
import regex:*
// nonterminals are negative, terminals are positive
obj Grammer<T,K> (Object) {
var nonterminals: vec<vec<vec<int>>>
var nonterminal_names: vec<str>
var nonterminal_funs: vec<vec<pair<K,fun(ref K, ref vec<T>): T>>>
var terminals: vec<regex>
var terminal_funs: vec<pair<K,fun(ref K,ref str,int,int): T>>
var start_symbol: int
fun construct(): *Grammer {
nonterminals.construct()
nonterminal_names.construct()
nonterminal_funs.construct()
terminals.construct()
terminal_funs.construct()
start_symbol = 0
return this
}
fun copy_construct(old: *Grammer): void {
nonterminals.copy_construct(&old->nonterminals)
nonterminal_names.copy_construct(&old->nonterminal_names)
nonterminal_funs.copy_construct(&old->nonterminal_funs)
terminals.copy_construct(&old->terminals)
terminal_funs.copy_construct(&old->terminal_funs)
start_symbol = old->start_symbol
}
fun destruct(): void {
nonterminals.destruct()
nonterminal_names.destruct()
nonterminal_funs.destruct()
terminals.destruct()
terminal_funs.destruct()
}
fun operator=(other:ref Grammer):void {
destruct()
copy_construct(&other)
}
fun add_new_nonterminal(name: *char, rule: ref vec<int>, d: K, f: fun(ref K,ref vec<T>): T): int {
return add_new_nonterminal(str(name), rule, d, f)
}
fun add_new_nonterminal(name: ref str, rule: ref vec<int>, d: K, f: fun(ref K,ref vec<T>): T): int {
nonterminals.add(vec(rule))
nonterminal_names.add(name)
nonterminal_funs.add(vec(make_pair(d,f)))
return -1*nonterminals.size
}
fun add_to_or_create_nonterminal(name: ref str, rule: ref vec<int>, d: K, f: fun(ref K,ref vec<T>): T): int {
var idx = nonterminal_names.find(name)
if idx >= 0 {
add_to_nonterminal(-1*(idx+1), rule, d, f)
return -1*(idx+1)
} else {
return add_new_nonterminal(name, rule, d, f)
}
}
fun add_to_nonterminal(nonterminal: int, rule: ref vec<int>, d: K, f: fun(ref K,ref vec<T>): T) {
nonterminals[(-1*nonterminal)-1].add(rule)
nonterminal_funs[(-1*nonterminal)-1].add(make_pair(d,f))
}
fun add_terminal(c: *char, d: K, f: fun(ref K,ref str,int,int): T): int {
return add_terminal(str(c), d, f)
}
fun add_terminal(c: ref str, d: K, f: fun(ref K,ref str,int,int): T): int {
terminals.add(regex(c))
terminal_funs.add(make_pair(d,f))
return terminals.size
}
fun get_nonterminal_rules(nonterminal: int): ref vec<vec<int>> {
return nonterminals[(-1*nonterminal)-1]
}
fun match_terminal(terminal: int, input: ref str, start: int): int {
return terminals[terminal-1].long_match(input.getBackingMemory(), start, input.length())
}
fun is_terminal(x: int): bool {
return x > 0
}
fun set_start_symbol(x: int) {
start_symbol = x
}
fun to_string(it: BS): str {
var rule_str = str()
for (var i = 0; i < nonterminals[(-1*it.nonterminal)-1][it.rule_idx].size; i++;) {
if i == it.idx_into_rule {
rule_str += "*"
}
var erminal = nonterminals[(-1*it.nonterminal)-1][it.rule_idx][i]
rule_str += to_string(erminal)
if i < nonterminals[(-1*it.nonterminal)-1][it.rule_idx].size-1 {
rule_str += " "
}
}
if it.idx_into_rule == nonterminals[(-1*it.nonterminal)-1][it.rule_idx].size {
rule_str += "*"
}
return str("<") + nonterminal_names[(-1*it.nonterminal)-1] + " ::= " + rule_str + ", " + it.left + ", " + it.pivot + ", " + it.right + ">"
}
fun to_string(erminal: int): str {
if erminal < 0 {
return nonterminal_names[(-1*erminal)-1]
} else {
return terminals[erminal-1].regexString
}
}
fun to_string(): str {
var to_ret = str()
for (var i = 0; i < nonterminals.size; i++;) {
for (var j = 0; j < nonterminals[i].size; j++;) {
to_ret += nonterminal_names[i] + " ::="
for (var k = 0; k < nonterminals[i][j].size; k++;) {
to_ret += " " + to_string(nonterminals[i][j][k])
}
to_ret += "\n"
}
}
return "start_symbol: " + to_string(start_symbol) + "\n" + to_ret
}
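// Evaluate a completed parse: find the top BSR element (start_symbol spanning the whole input) and fold it recursively.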
fun eval_BSR(input: ref str, BSR: ref set<BS>): T {
var top = -1
for (var i = 0; i < BSR.data.size; i++;) {
if BSR.data[i].nonterminal == start_symbol && BSR.data[i].idx_into_rule == nonterminals[(-1*BSR.data[i].nonterminal)-1][BSR.data[i].rule_idx].size && BSR.data[i].left == 0 && BSR.data[i].right == input.length() {
top = i
break
}
}
if top == -1 {
println("Could not find top for input:")
println(input)
println(str("of length ") + input.length())
for (var i = 0; i < BSR.data.size; i++;) {
println(str() + i + ": " + to_string(BSR.data[i]))
}
error("Could not find top")
}
return eval_BSR(input, BSR, top)
}
fun eval_BSR(input: ref str, BSR: ref set<BS>, c: int): T {
var bs = BSR.data[c]
var nonterminal = (-1*bs.nonterminal)-1
if bs.idx_into_rule != nonterminals[nonterminal][bs.rule_idx].size {
error("Evaluating BSR from not the end!")
}
var params = vec<T>()
for (var i = bs.idx_into_rule-1; i >= 0; i--;) {
var erminal = nonterminals[nonterminal][bs.rule_idx][i]
if is_terminal(erminal) {
var right_value = terminal_funs[erminal-1].second(terminal_funs[erminal-1].first, input, bs.pivot, bs.right)
params.add(right_value)
} else {
/*var right = find_comp(erminal, bs.pivot, bs.right)*/
var right = -1
var sub_nonterminal_idx = (-1*erminal)-1
for (var j = 0; j < BSR.data.size; j++;) {
if BSR.data[j].nonterminal == erminal && BSR.data[j].idx_into_rule == nonterminals[sub_nonterminal_idx][BSR.data[j].rule_idx].size && BSR.data[j].left == bs.pivot && BSR.data[j].right == bs.right {
right = j
break
}
}
var right_value = eval_BSR(input, BSR, right)
params.add(right_value)
}
// get the new left bs
if i != 0 {
/*var new_bs_idx = find_mid(bs.nonterminal, bs.rule_idx, i, bs.left, bs.pivot)*/
var new_bs_idx = -1
for (var j = 0; j < BSR.data.size; j++;) {
if BSR.data[j].nonterminal == bs.nonterminal && BSR.data[j].rule_idx == bs.rule_idx && BSR.data[j].idx_into_rule == i && BSR.data[j].left == bs.left && BSR.data[j].right == bs.pivot {
new_bs_idx = j
break
}
}
bs = BSR.data[new_bs_idx]
}
}
var to_ret = nonterminal_funs[nonterminal][bs.rule_idx].second(nonterminal_funs[nonterminal][bs.rule_idx].first, params.reverse())
return to_ret
}
}
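// Pending: a suspended grammar slot (nonterminal, rule, position in rule, left extent) waiting on a nonterminal's result; stored in the G map.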
obj Pending (Object) {
var nonterminal: int
var rule_idx: int
var idx_into_rule: int
var left: int
fun construct(): *Pending {
return this->construct(0,0,0,0)
}
fun construct(nonterminal: int, rule_idx: int, idx_into_rule: int, left: int): *Pending {
this->nonterminal = nonterminal;
this->rule_idx = rule_idx;
this->idx_into_rule = idx_into_rule;
this->left = left;
return this
}
fun copy_construct(old: *Pending): void {
this->nonterminal = old->nonterminal;
this->rule_idx = old->rule_idx;
this->idx_into_rule = old->idx_into_rule;
this->left = old->left;
}
fun destruct(): void { }
fun operator=(other:ref Pending):void {
destruct()
copy_construct(&other)
}
fun operator==(rhs: ref Pending): bool {
return nonterminal == rhs.nonterminal && rule_idx == rhs.rule_idx && idx_into_rule == rhs.idx_into_rule && left == rhs.left
}
}
fun pending(nonterminal: int, rule_idx: int, idx_into_rule: int, left: int): Pending {
var to_ret.construct(nonterminal, rule_idx, idx_into_rule, left): Pending
return to_ret
}
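// Descriptor: a grammar slot plus left extent and pivot; the unit of work processed by the fungll loop below.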
obj Descriptor (Object, Hashable) {
var nonterminal: int
var rule_idx: int
var idx_into_rule: int
var left: int
var pivot: int
fun construct(): *Descriptor {
return this->construct(0,0,0,0,0)
}
fun construct(nonterminal: int, rule_idx: int, idx_into_rule: int, left: int, pivot: int): *Descriptor {
this->nonterminal = nonterminal;
this->rule_idx = rule_idx;
this->idx_into_rule = idx_into_rule;
this->left = left;
this->pivot = pivot;
return this
}
fun copy_construct(old: *Descriptor): void {
this->nonterminal = old->nonterminal;
this->rule_idx = old->rule_idx;
this->idx_into_rule = old->idx_into_rule;
this->left = old->left;
this->pivot = old->pivot;
}
fun destruct(): void { }
fun operator=(other:ref Descriptor):void {
destruct()
copy_construct(&other)
}
fun operator==(rhs: ref Descriptor): bool {
return nonterminal == rhs.nonterminal && rule_idx == rhs.rule_idx && idx_into_rule == rhs.idx_into_rule && left == rhs.left && pivot == rhs.pivot
}
fun hash():ulong {
//return hash(nonterminal) ^ hash(rule_idx) ^ hash(idx_into_rule) ^ hash(left) ^ hash(pivot)
return nonterminal*3 + rule_idx*5 + idx_into_rule*7 + left*11 + pivot*13
}
}
fun descriptor(nonterminal: int, rule_idx: int, idx_into_rule: int, left: int, pivot: int): Descriptor {
var to_ret.construct(nonterminal, rule_idx, idx_into_rule, left, pivot): Descriptor
return to_ret
}
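// BS: one element of the BSR set -- a grammar slot plus (left, pivot, right) input extents.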
obj BS (Object) {
var nonterminal: int
var rule_idx: int
var idx_into_rule: int
var left: int
var pivot: int
var right: int
fun construct(): *BS {
return this->construct(0,0,0,0,0,0)
}
fun construct(nonterminal: int, rule_idx: int, idx_into_rule: int, left: int, pivot: int, right: int): *BS {
this->nonterminal = nonterminal;
this->rule_idx = rule_idx;
this->idx_into_rule = idx_into_rule;
this->left = left;
this->pivot = pivot;
this->right = right;
return this
}
fun copy_construct(old: *BS): void {
this->nonterminal = old->nonterminal;
this->rule_idx = old->rule_idx;
this->idx_into_rule = old->idx_into_rule;
this->left = old->left;
this->pivot = old->pivot;
this->right = old->right;
}
fun destruct(): void { }
fun operator=(other:ref BS):void {
destruct()
copy_construct(&other)
}
fun to_string(): str {
return str("nonterminal:") + nonterminal + " rule_idx:" + rule_idx + " idx_into_rule:" + idx_into_rule + " l:" + left + " p:" + pivot + " r:" + right
}
fun operator==(rhs: ref BS): bool {
return nonterminal == rhs.nonterminal && rule_idx == rhs.rule_idx && idx_into_rule == rhs.idx_into_rule && left == rhs.left && pivot == rhs.pivot && right == rhs.right
}
}
fun bs(nonterminal: int, rule_idx: int, idx_into_rule: int, left: int, pivot: int, right: int): BS {
var to_ret.construct(nonterminal, rule_idx, idx_into_rule, left, pivot, right): BS
return to_ret
}
/*fun fungll<T,K>(grammar: ref Grammer<T,K>, start_symbol: *char, input: ref str): set<BS> {*/
/*return fungll(grammar, str(start_symbol), input)*/
/*}*/
/*fun fungll<T,K>(grammar: ref Grammer<T,K>, start_symbol: str, input: ref str): set<BS> {*/
/*return fungll(grammar, -1*(grammar.nonterminal_funs.find(start_symbol)+1), input)*/
/*}*/
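// FUN-GLL driver: seeds descriptors for start_symbol at position 0 and takes the chaotic-iteration closure,
// accumulating the BSR set Y, the continuation map G ((nonterminal, left) -> pending slots), and the pop map P ((nonterminal, left) -> right extents).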
fun fungll<T,K>(grammar: ref Grammer<T,K>, start_symbol: int, input: ref str): set<BS> {
var R = descend(grammar, start_symbol, 0)
var G = map<pair<int, int>, set<Pending>>()
var P = map<pair<int,int>, set<int>>()
var Y = set<BS>()
R.chaotic_closure(fun(d: Descriptor): set<Descriptor> {
var it = process(grammar, input, d, G, P)
//var Yp = it.first.second
Y += it.first.second
var Gp = it.second
var Pp = it.third
for (var i = 0; i < Gp.keys.size; i++;) {
if G.contains_key(Gp.keys[i]) {
G[Gp.keys[i]].add(Gp.values[i])
} else {
G[Gp.keys[i]] = Gp.values[i]
}
}
for (var i = 0; i < Pp.keys.size; i++;) {
if P.contains_key(Pp.keys[i]) {
P[Pp.keys[i]].add(Pp.values[i])
} else {
P[Pp.keys[i]] = Pp.values[i]
}
}
// Rp
return it.first.first
})
return Y
}
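// descend: create a descriptor for every alternative of the given symbol, starting (and pivoting) at input position l.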
fun descend<T,K>(grammar: ref Grammer<T,K>, symbol: int, l: int): set<Descriptor> {
var to_ret = set<Descriptor>()
for (var rhs = 0; rhs < grammar.get_nonterminal_rules(symbol).size; rhs++;)
to_ret.add(descriptor(symbol, rhs, 0, l, l))
return to_ret
}
fun process<T,K>(grammar: ref Grammer<T,K>, input: ref str, descript: Descriptor, G: ref map<pair<int, int>, set<Pending>>, P: ref map<pair<int,int>, set<int>>): triple<pair<set<Descriptor>, set<BS>>, map<pair<int, int>, set<Pending>>, map<pair<int,int>, set<int>>> {
// if at end / end is emptystr
if descript.idx_into_rule == grammar.get_nonterminal_rules(descript.nonterminal)[descript.rule_idx].size {
return process_e(grammar, descript, G, P)
} else {
return process_symbol(grammar, input, descript, G, P)
}
}
fun process_e<T,K>(grammar: ref Grammer<T,K>, descript: Descriptor, G: ref map<pair<int, int>, set<Pending>>, P: ref map<pair<int,int>, set<int>>): triple<pair<set<Descriptor>, set<BS>>, map<pair<int, int>, set<Pending>>, map<pair<int,int>, set<int>>> {
var nonterminal: int
var rule_idx: int
var left: int
var pivot: int
var X = descript.nonterminal
var l = descript.left;
var k = descript.pivot;
var K = G.get_with_default(make_pair(X,l), set<Pending>())
var it = ascend(l,K,k)
var R = it.first
var Y = it.second
if grammar.get_nonterminal_rules(X)[descript.rule_idx].size == 0 {
Y.add(bs(X,descript.rule_idx, 0, l, l, l))
}
return make_triple(make_pair(R,Y), map<pair<int, int>, set<Pending>>(), map(make_pair(X,l), set(k)))
}
fun process_symbol<T,K>(grammar: ref Grammer<T,K>, input: ref str, descript: Descriptor, G: ref map<pair<int, int>, set<Pending>>, P: ref map<pair<int,int>, set<int>>): triple<pair<set<Descriptor>, set<BS>>, map<pair<int, int>, set<Pending>>, map<pair<int,int>, set<int>>> {
var s = grammar.get_nonterminal_rules(descript.nonterminal)[descript.rule_idx][descript.idx_into_rule]
var k = descript.pivot
var R = P.get_with_default(make_pair(s,k), set<int>())
var Gp = map(make_pair(s,k), set(pending(descript.nonterminal, descript.rule_idx, descript.idx_into_rule+1, descript.left)))
if grammar.is_terminal(s) {
return make_triple(matc(grammar,input,descript), map<pair<int,int>, set<Pending>>(), map<pair<int,int>, set<int>>())
} else if R.size() == 0 { // s in N
return make_triple(make_pair(descend(grammar,s,k), set<BS>()), Gp, map<pair<int,int>, set<int>>())
} else { // s in N and R != set()
return make_triple(skip(k,pending(descript.nonterminal, descript.rule_idx, descript.idx_into_rule+1, descript.left), R), Gp, map<pair<int,int>, set<int>>())
}
}
fun matc<T,K>(grammar: ref Grammer<T,K>, input: ref str, descript: Descriptor): pair<set<Descriptor>, set<BS>> {
/*println("trying to match " + grammar.to_string(grammar.get_nonterminal_rules(descript.nonterminal)[descript.rule_idx][descript.idx_into_rule]))*/
var match_length = grammar.match_terminal(grammar.get_nonterminal_rules(descript.nonterminal)[descript.rule_idx][descript.idx_into_rule], input, descript.pivot)
if match_length > 0 {
/*println("matched " + grammar.to_string(grammar.get_nonterminal_rules(descript.nonterminal)[descript.rule_idx][descript.idx_into_rule]))*/
return make_pair(set(descriptor(descript.nonterminal, descript.rule_idx, descript.idx_into_rule+1, descript.left, descript.pivot+match_length)), set(bs(descript.nonterminal, descript.rule_idx, descript.idx_into_rule+1, descript.left, descript.pivot, descript.pivot+match_length)))
} else {
return make_pair(set<Descriptor>(), set<BS>())
}
}
fun skip(k: int, c: Pending, R: ref set<int>): pair<set<Descriptor>, set<BS>> { return nmatch(k, set(c), R); }
fun ascend(k:int, K: ref set<Pending>, r: int): pair<set<Descriptor>, set<BS>> { return nmatch(k, K, set(r)); }
fun nmatch(k:int, K: ref set<Pending>, R: ref set<int>): pair<set<Descriptor>, set<BS>> {
var Rp = set<Descriptor>()
var Y = set<BS>()
for (var i = 0; i < K.data.size; i++;) {
var pending = K.data[i]
for (var j = 0; j < R.data.size; j++;) {
var r = R.data[j]
Rp.add(descriptor(pending.nonterminal, pending.rule_idx, pending.idx_into_rule, pending.left, r))
Y.add(bs(pending.nonterminal, pending.rule_idx, pending.idx_into_rule, pending.left, k, r))
}
}
return make_pair(Rp,Y)
}
/*fun main(argc: int, argv: **char): int {*/
/*var grammar.construct(): Grammer<int>*/
/*var Number = grammar.add_new_nonterminal("Number", vec(grammar.add_terminal("[0-9]+", fun(input: ref str, l: int, r: int): int { return string_to_num<int>(input.slice(l,r)); })), fun(i: ref vec<int>): int { return i[0]; })*/
/*var mult = grammar.add_terminal("\\*", fun(input: ref str, l: int, r: int): int { return 1; })*/
/*var Factor = grammar.add_new_nonterminal("Factor", vec(Number), fun(i: ref vec<int>): int { return i[0]; })*/
/*grammar.add_to_nonterminal(Factor, vec(Factor, mult, Number), fun(i: ref vec<int>): int { return i[0]*i[2]; })*/
/*var add = grammar.add_terminal("\\+", fun(input: ref str, l: int, r: int): int { return 1; })*/
/*var Term = grammar.add_new_nonterminal("Term", vec(Factor), fun(i: ref vec<int>): int { return i[0]; })*/
/*grammar.add_to_nonterminal(Term, vec(Term, add, Factor), fun(i: ref vec<int>): int { return i[0]+i[2]; })*/
/*grammar.set_start_symbol(Term)*/
/*var input = str("1+23*44")*/
/*var BSR = fungll(grammar, input)*/
/*println(str("length of BSR is: ") + BSR.size())*/
/*for (var i = 0; i < BSR.data.size; i++;) {*/
/*var BS = BSR.data[i]*/
/*println(str() + i + ": " + grammar.to_string(BSR.data[i]))*/
/*}*/
/*var res = grammar.eval_BSR(input, BSR)*/
/*println(str("result of grammar.eval_BSR(fungll(grammar, ") + input + ")) = " + res)*/
/*return 0*/
/*}*/

View File

@@ -1,7 +0,0 @@
correctly importing / running tests is a nightmare with relative paths.
Namespaces
Imports allow renaming of either entire scope or individual members, and can import from within a scope
Fix // comments right before a top level function declaration. Something to do
with eating the return (newline), as /* comment */ does not have that problem

1870
k.krak

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,220 +0,0 @@
import io:*
import grammer:*
import lexer:*
import parser:*
import str:*
import util:*
import symbol:*
import tree:*
import serialize:*
import interpreter:*
import bytecode_generator:*
import os:*
import ast_transformation:*
import importer:*
import adt_lower:*
import obj_lower:*
import defer_lower:*
import function_value_lower:*
import ref_lower:*
import ctce_lower:*
import address_of_ensure_variable_lower:*
import c_line_control:*
import node_counter:*
import c_generator:*
import vec:*
import set:*
fun main(argc: int, argv: **char):int {
var curr_time = get_time()
// delay construction until we either load it or copy construct it
var gram: grammer
var base_dir = str("/").join(str(argv[0]).split('/').slice(0,-2))
var file_name = base_dir + "/krakenGrammer.kgm"
var compiled_name = file_name + str(".comp_new")
var compiled_version = 1
var file_contents = read_file(file_name)
var loaded_and_valid = false
var doing_repl = false
if (argc <= 1) {
println("No input file!\n Call with one argument (the input file), or two arguments (input file and output name)\n Falling into REPL...")
compiled_name += ".expr"
file_contents = str("RealGoal = boolean_expression ;\n") + file_contents
doing_repl = true
} else if (str(argv[1]) == "-v" || str(argv[1]) == "--version") {
/*var version_c_string = #ctce(fun(): *char {*/
/*var version_string = str("Self-hosted Kraken compiler \"Kalypso\" - revision ") + from_system_command(str("git rev-list HEAD | wc -l"), 100) +*/
/*", commit: " + from_system_command(str("git rev-parse HEAD"), 100) +*/
/*", compile date: " + from_system_command(str("date"), 100) */
/*return version_string.toCharArray()*/
/*}())*/
/*println(version_c_string)*/
exit(0)
}
var input_file_offset = 1
var interpret_instead = false
var opt_str = str("-O2")
var line_ctrl = false
var compile_c = true
var positional_args = vec<str>()
var flags = set<str>()
for (var i = 1; i < argc; i++;) {
var arg_str = str(argv[i])
if (arg_str == "-i") {
interpret_instead = true
} else if (arg_str.length() > 2 && arg_str.slice(0,2) == "-O") {
opt_str = arg_str
} else if (arg_str == "-g") {
line_ctrl = true
} else if (arg_str == "--no-c-compile") {
compile_c = false
} else if (arg_str.length() > 2 && arg_str.first() == '-') {
flags.add(arg_str.slice(1,-1))
} else {
positional_args.add(arg_str)
}
}
/*positional_args.for_each(fun(i:str) println("positional_arg: " + i);)*/
flags.for_each(fun(i:str) println("flag: " + i);)
if (file_exists(compiled_name)) {
var pos = 0
var binary = read_file_binary(compiled_name)
var saved_version = 0
unpack(saved_version, pos) = unserialize<int>(binary, pos)
if (saved_version == compiled_version) {
var cached_contents = str()
unpack(cached_contents, pos) = unserialize<str>(binary, pos)
if (cached_contents == file_contents) {
loaded_and_valid = true
pos = gram.unserialize(binary, pos)
} else println("contents different")
} else println("version number different")
} else {
println("cached file does not exist")
}
if (!loaded_and_valid) {
println("Not loaded_and_valid, re-generating and writing out")
// since we now don't construct before hand
gram.copy_construct(&load_grammer(file_contents))
println("grammer loaded, calculate_first_set")
gram.calculate_first_set()
println("grammer loaded, calculate_state_automaton")
gram.calculate_state_automaton()
println("calculated, writing out")
write_file_binary(compiled_name, serialize(compiled_version) + serialize(file_contents) + serialize(gram))
println("done writing")
curr_time = split(curr_time, "Grammer regen")
}
var lex = lexer(gram.terminals)
var parse1.construct(&gram, &lex): parser
/*var parse2.construct(&gram): parser*/
/*var parse3.construct(&gram): parser*/
/*var parse4.construct(&gram): parser*/
/*var parse5.construct(&gram): parser*/
/*var parse6.construct(&gram): parser*/
/*var parse7.construct(&gram): parser*/
/*var parse8.construct(&gram): parser*/
var ast_pass.construct(): ast_transformation
var parsers = vec(parse1)
/*var parsers = vec(parse1,parse2,parse3,parse4)*/
/*var parsers = vec(parse1,parse2,parse3,parse4,parse5,parse6)*/
/*var parsers = vec(parse1,parse2,parse3,parse4,parse5,parse6,parse7,parse8)*/
// This is our REPL loop
var scope = _translation_unit(str("stdin"))
if (doing_repl) {
/*var globals = setup_globals(importer.name_ast_map)*/
while (doing_repl) {
var line = get_line(str("> "), 100)
if (line == "end")
return 0
var parse = parse1.parse_input(line, str("stdin"))
trim(parse)
var ast_expression = ast_pass.transform_expression(parse, scope, map<str, *type>())
print_value(evaluate_constant_expression(ast_expression))
/*print_value(evaluate_with_globals(ast_expression, &globals))*/
}
}
var kraken_file_name = positional_args[0]
var executable_name = str(".").join(kraken_file_name.split('.').slice(0,-2))
if (positional_args.size > 1)
executable_name = positional_args[1]
curr_time = split(curr_time, "Finish setup")
var name_ast_map = import(kraken_file_name, parsers, ast_pass, vec(str(), base_dir + "/stdlib/"))
curr_time = split(curr_time, "Import")
// Passes
/*printlnerr("Counting Nodes")*/
/*node_counter(&name_ast_map, &ast_pass.ast_to_syntax)*/
/*printlnerr("Lowering ADTs")*/
adt_lower(&name_ast_map, &ast_pass.ast_to_syntax)
curr_time = split(curr_time, "Lowering ADTs")
/*printlnerr("Counting Nodes")*/
/*node_counter(&name_ast_map, &ast_pass.ast_to_syntax)*/
/*printlnerr("Lowering Objects")*/
obj_lower(&name_ast_map, &ast_pass.ast_to_syntax)
curr_time = split(curr_time, "Lowering Objects")
/*printlnerr("Counting Nodes")*/
/*node_counter(&name_ast_map, &ast_pass.ast_to_syntax)*/
/*printlnerr("Lowering Defer")*/
defer_lower(&name_ast_map, &ast_pass.ast_to_syntax)
curr_time = split(curr_time, "Lowering Defer")
/*printlnerr("Counting Nodes")*/
/*node_counter(&name_ast_map, &ast_pass.ast_to_syntax)*/
// Should come after lowering of ADTs and before lowering of Refs
/*printlnerr("Lowering Function Values (Lambdas, etc)")*/
function_value_lower(&name_ast_map, &ast_pass.ast_to_syntax)
curr_time = split(curr_time, "Lowering Function Values (Lambdas, etc)")
/*printlnerr("Counting Nodes")*/
/*node_counter(&name_ast_map, &ast_pass.ast_to_syntax)*/
/*printlnerr("Lowering Ref")*/
ref_lower(&name_ast_map, &ast_pass.ast_to_syntax)
curr_time = split(curr_time, "Lowering Ref")
/*printlnerr("Counting Nodes")*/
/*node_counter(&name_ast_map, &ast_pass.ast_to_syntax)*/
// Lowers #ctce and the current #ctce_pass
/*printlnerr("Lowering CTCE")*/
ctce_lower(&name_ast_map, &ast_pass.ast_to_syntax)
curr_time = split(curr_time, "Lowering CTCE")
/*printlnerr("Counting Nodes")*/
/*node_counter(&name_ast_map, &ast_pass.ast_to_syntax)*/
// Makes sure that & always takes reference to a variable
/*printlnerr("Lowering & to always have variable")*/
address_of_ensure_variable_lower(&name_ast_map, &ast_pass.ast_to_syntax)
curr_time = split(curr_time, "Lowering & to always have variable")
if (interpret_instead) {
/*printlnerr("Interpreting!")*/
/*call_main(name_ast_map)*/
printlnerr("Generating bytecode!")
var generator.construct(): bytecode_generator
/*var bytecode = generator.generate_bytecode(name_ast_map)*/
generator.generate_bytecode(name_ast_map)
/*printlnerr(bytecode_to_string(bytecode))*/
printlnerr("return code is ")
printlnerr(to_string(generator.evaluate()))
} else {
if (line_ctrl) {
printlnerr("running C-specific passes")
printlnerr("running #line")
c_line_control(&name_ast_map, &ast_pass.ast_to_syntax)
}
/*printlnerr("Generating C")*/
var c_generator.construct(): c_generator
var c_output_pair = c_generator.generate_c(name_ast_map, ast_pass.ast_to_syntax)
var kraken_c_output_name = kraken_file_name + ".c"
write_file(kraken_c_output_name, c_output_pair.first)
curr_time = split(curr_time, "Generating C")
if (compile_c) {
var compile_string = "cc -g " + opt_str + " -Wno-int-to-pointer-cast -Wno-pointer-to-int-cast -Wno-incompatible-pointer-types -std=c99 " + c_output_pair.second + " " + kraken_c_output_name + " -o " + executable_name
printlnerr(compile_string)
system(compile_string)
curr_time = split(curr_time, "Compiling C")
}
}
return 0
}

View File

@@ -1,158 +0,0 @@
Goal = translation_unit ;
cast_expression = "\(" WS boolean_expression WS "\)" WS "cast" WS type ;
translation_unit = WS unorderd_list_part WS ;
unorderd_list_part = import WS unorderd_list_part | function WS unorderd_list_part | type_def line_end WS unorderd_list_part | adt_def line_end WS unorderd_list_part | if_comp WS unorderd_list_part | simple_passthrough WS unorderd_list_part | declaration_statement line_end WS unorderd_list_part | compiler_intrinsic line_end WS unorderd_list_part | import | function | type_def line_end | adt_def line_end | if_comp | simple_passthrough | declaration_statement line_end | compiler_intrinsic line_end ;
type = "ref" WS pre_reffed | pre_reffed ;
pre_reffed = "\*" WS pre_reffed | "void" | "bool" | "char" | "uchar" | "short" | "ushort" | "int" | "uint" | "long" | "ulong" | "float" | "double" | scoped_identifier | scoped_identifier WS template_inst | function_type ;
function_type = "fun" WS "\(" WS opt_type_list WS "\)" WS ":" WS type | "run" WS "\(" WS opt_type_list WS "\)" WS ":" WS type ;
dec_type = ":" WS type ;
opt_type_list = type_list | ;
template_inst = "<" WS type_list WS ">" ;
type_list = type_list WS "," WS type | type ;
template_dec = "<" WS template_param_list WS ">" ;
template_param_list = template_param_list WS "," WS template_param | template_param ;
template_param = identifier WS traits | identifier ;
import = "import" WS identifier line_end | "import" WS identifier WS ":" WS "\*" line_end | "import" WS identifier WS ":" WS identifier_list line_end ;
identifier_list = identifier | identifier WS "," WS identifier_list ;
# all for optional semicolons
line_break = "
+" ;
# why use line_white here but not below? who knows. It's way faster this way. Or maybe when I changed it there was a typing mistake. No one knows.
line_white = "( | )+" ;
actual_white = line_white | line_break | line_break actual_white | line_white actual_white ;
# Why is WS comment necessary? The null case SHOULD handle it, I think. I'm just a tad worried...
WS = actual_white | WS comment WS | WS comment | ;
# cpp_comment lets us do stuff like ending a statement with a cpp comment - c comments already work as they don't eat the return
maybe_line_white = "( | )+" | c_comment | maybe_line_white c_comment | maybe_line_white "( | )+" | ;
line_end = maybe_line_white ";" | maybe_line_white line_break | maybe_line_white cpp_comment ;
#line_end = maybe_line_white ";" | maybe_line_white line_break | maybe_line_white cpp_comment | WS c_comment line_end ;
# line_end = "( | )+" ";" | "( | )+" line_break | "( | )+" cpp_comment | ";" | line_break | cpp_comment ;
# line_end = WS ";" | WS line_break | WS cpp_comment ;
# line_end = "( | )+" ending | ending ;
# ending = ";" | line_break | cpp_comment ;
if_comp = "__if_comp__" WS identifier WS statement ;
#if_comp = "__if_comp__" WS identifier WS if_comp_pred ;
#if_comp_pred = code_block | simple_passthrough ;
simple_passthrough = "simple_passthrough" WS passthrough_params WS triple_quoted_string ;
passthrough_params = "\(" WS in_passthrough_params WS ":" WS out_passthrough_params WS ":" WS opt_string WS "\)" | ;
in_passthrough_params = opt_param_assign_list ;
out_passthrough_params = opt_param_assign_list ;
opt_param_assign_list = param_assign_list | ;
param_assign_list = param_assign WS "," WS param_assign_list | param_assign ;
param_assign = identifier WS "=" WS identifier | identifier ;
opt_string = string | ;
triple_quoted_string = "\"\"\"((\"\"(`|[0-9]|-|=| |[a-z]|\[|]|\\|;|'|
|,|.|/|~|!|@|#|$|%|^|&|\*|\(|\)|_|\+|[A-Z]|{|}|\||:|<|>|\?| )+)|(\"(`|[0-9]|-|=| |[a-z]|\[|]|\\|'|
|,|.|/|~|!|@|#|$|%|^|&|\*|\(|\)|_|\+|[A-Z]|{|}|\||:|<|>|\?| )+))*(`|[0-9]|-|=| |[a-z]|\[|]|\\|;|'|
|,|.|/|~|!|@|#|$|%|^|&|\*|\(|\)|_|\+|[A-Z]|{|}|\||:|<|>|\?| )*(((`|[0-9]|-|=| |[a-z]|\[|]|\\|;|'|
|,|.|/|~|!|@|#|$|%|^|&|\*|\(|\)|_|\+|[A-Z]|{|}|\||:|<|>|\?| )+\")|((`|[0-9]|-|=| |[a-z]|\[|]|\\|;|'|
|,|.|/|~|!|@|#|$|%|^|&|\*|\(|\)|_|\+|[A-Z]|{|}|\||:|<|>|\?| )+\"\")|((`|[0-9]|-|=| |[a-z]|\[|]|\\|;|'|
|,|.|/|~|!|@|#|$|%|^|&|\*|\(|\)|_|\+|[A-Z]|{|}|\||:|<|>|\?| )+))*\"\"\"" ;
#identifier = alpha_alphanumeric ;
identifier = augmented_alpha_alphanumeric ;
scope_op = ":" ":" ;
scoped_identifier = scoped_identifier WS scope_op WS identifier | identifier ;
# Note that to prevent conflict with nested templates (T<A<B>>), right_shift is a nonterminal constructed as follows
right_shift = ">" ">" ;
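# e.g. the ">>" that closes a nested template instantiation such as map<pair<int,int>, set<Pending>> must be read as two ">" tokens, one for each template_inst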
overloadable_operator = "\+" | "-" | "\*" | "/" | "%" | "^" | "&" | "\|" | "~" | "!" | "," | "=" | "\+\+" | "--" | "<<" | "<" | ">" | "<=" | ">=" | right_shift | "==" | "!=" | "&&" | "\|\|" | "\+=" | "-=" | "/=" | "%=" | "^=" | "&=" | "\|=" | "\*=" | "<<=" | ">>=" | "->" | "\(" "\)" | "\[]" | "\[]=" ;
func_identifier = identifier | identifier overloadable_operator ;
# allow omitting of return type (automatic void)
# HACKY - typed_return has its own internal whitespace so as to not make WS typed_return-reduces-to-null WS ambiguous
typed_return = WS dec_type | ;
function = "ext" WS "fun" WS func_identifier WS "\(" WS opt_typed_parameter_list WS "\)" typed_return | "fun" WS func_identifier WS template_dec WS "\(" WS opt_typed_parameter_list WS "\)" typed_return WS statement | "fun" WS func_identifier WS "\(" WS opt_typed_parameter_list WS "\)" typed_return WS statement ;
lambda = "fun" WS "\(" WS opt_typed_parameter_list WS "\)" typed_return WS statement ;
opt_typed_parameter_list = typed_parameter_list | typed_parameter_list WS "," WS "..." | ;
typed_parameter_list = typed_parameter_list WS "," WS typed_parameter | typed_parameter ;
typed_parameter = identifier WS dec_type ;
opt_parameter_list = parameter_list | ;
parameter_list = parameter_list WS "," WS parameter | parameter ;
parameter = boolean_expression ;
obj_nonterm = "obj" | "uni" ;
type_def = obj_nonterm WS identifier WS template_dec WS "{" WS declaration_block WS "}" | obj_nonterm WS identifier WS "{" WS declaration_block WS "}" | obj_nonterm WS identifier WS template_dec WS traits WS "{" WS declaration_block WS "}" | obj_nonterm WS identifier WS traits WS "{" WS declaration_block WS "}" ;
declaration_block = declaration_statement line_end WS declaration_block | function WS declaration_block | declaration_statement line_end | function | ;
traits = "\(" WS trait_list WS "\)" ;
trait_list = trait_list WS "," WS scoped_identifier | scoped_identifier ;
adt_nonterm = "adt" ;
adt_def = adt_nonterm WS identifier WS "{" WS adt_option_list WS "}" | adt_nonterm WS identifier WS template_dec WS "{" WS adt_option_list WS "}" ;
adt_option_list = adt_option | adt_option WS "," WS adt_option_list ;
adt_option = identifier | identifier WS dec_type ;
if_statement = "if" WS boolean_expression WS statement | "if" WS boolean_expression WS statement WS "else" WS statement ;
match_statement = "match" WS "\(" WS boolean_expression WS "\)" WS "{" WS case_statement_list WS "}" ;
case_statement_list = case_statement WS case_statement_list | case_statement ;
case_statement = scoped_identifier WS "\(" WS identifier WS "\)" WS statement | scoped_identifier WS "\(" WS "\)" WS statement ;
while_loop = "while" WS boolean_expression WS statement ;
for_loop = "for" WS "\(" WS statement WS boolean_expression line_end WS statement WS "\)" WS statement ;
return_statement = "return" | "return" WS boolean_expression ;
code_block = "{" WS statement_list WS "}" | "{" WS "}" ;
statement_list = statement_list WS statement | statement ;
statement = if_statement | match_statement | while_loop | for_loop | return_statement line_end | boolean_expression line_end | assignment_statement line_end | declaration_statement line_end | code_block | if_comp | simple_passthrough | break_statement | continue_statement | defer_statement ;
break_statement = "break" ;
continue_statement = "continue" ;
defer_statement = "defer" WS statement ;
function_call = unarad "\(" WS opt_parameter_list WS "\)" ;
compiler_intrinsic = "#" identifier WS "\(" WS opt_parameter_list WS "\)" | "#" identifier WS "<" WS type_list WS ">" ;
boolean_expression = boolean_expression WS "\|\|" WS and_boolean_expression | and_boolean_expression ;
and_boolean_expression = and_boolean_expression WS "&&" WS bitwise_or | bitwise_or ;
bitwise_or = bitwise_or WS "\|" WS bitwise_xor | bitwise_xor ;
bitwise_xor = bitwise_xor WS "^" WS bitwise_and | bitwise_and ;
bitwise_and = bitwise_and WS "&" WS bool_exp | bool_exp ;
bool_exp = expression WS comparator WS expression | expression ;
comparator = "==" | "<=" | ">=" | "!=" | "<" | ">" ;
expression = expression WS "<<" WS term | expression WS right_shift WS shiftand | shiftand ;
shiftand = shiftand WS "-" WS term | shiftand WS "\+" WS term | term ;
term = term WS "/" WS factor | term WS "\*" WS factor | term WS "%" WS factor | factor ;
factor = "\+\+" WS unarad | unarad WS "\+\+" | "--" WS unarad | unarad WS "--" | "\+" WS unarad | "-" WS unarad | "!" WS unarad | "~" WS unarad | "\*" WS unarad | "&" WS unarad | unarad ;
unarad = number | scoped_identifier | scoped_identifier WS template_inst | access_operation | function_call | compiler_intrinsic | bool | string | character | "\(" WS boolean_expression WS "\)" | unarad WS "\[" WS expression WS "]" | lambda | cast_expression ;
cast_expression = "\(" WS boolean_expression WS "\)" WS "cast" WS type ;
number = integer | floating_literal ;
access_operation = unarad WS "." WS identifier | unarad WS "->" WS identifier | unarad WS "." WS identifier WS template_inst | unarad WS "->" WS identifier WS template_inst ;
assignment_statement = factor WS "=" WS boolean_expression | factor WS "\+=" WS boolean_expression | factor WS "-=" WS boolean_expression | factor WS "\*=" WS boolean_expression | factor WS "/=" WS boolean_expression | factor WS "^=" WS boolean_expression ;
# if it's being assigned to, we allow type inferencing
declaration_statement = "var" WS identifier WS "=" WS boolean_expression | "var" WS identifier WS dec_type WS "=" WS boolean_expression | "var" WS identifier WS dec_type | "ext" WS "var" WS identifier WS dec_type | "var" WS identifier WS "." WS identifier WS "\(" WS opt_parameter_list WS "\)" WS dec_type ;
hexadecimal = "0x([0-9]|[a-f])+" ;
integer = "[0-9]+u?(c|s|l)?" | hexadecimal ;
floating_literal = "[0-9]+.[0-9]+(f|d)?" ;
bool = "true" | "false" ;
character = "'(`|[0-9]|-|=|(\\t)|[a-z]|\[|]|(\\\\)|;|(\\')|(\\n)|,|.|/|~|!|@|#|$|%|^|&|\*|\(|\)|_|\+|[A-Z]|{|}|\||:|\"|<|>|\?| |(\\0))'" ;
keywords_also_identifiers = "obj" | "def" | "fun" | "run" | "var" | "ref" | "adt" | "cast" | "import" | "simple_passthrough" ;
alpha_alphanumeric = "([a-z]|[A-Z]|_)([a-z]|[A-Z]|_|[0-9])*" ;
augmented_alpha_alphanumeric = alpha_alphanumeric augmented_alpha_alphanumeric | keywords_also_identifiers augmented_alpha_alphanumeric | alpha_alphanumeric | keywords_also_identifiers ;
# note the hacks around backslash-escaped characters. Hmm, I feel like it actually shouldn't be like this. Added \\\* because I want to come back to it later
string = triple_quoted_string | "\"([#-[]| |[]-~]|(\\\\)|(\\n)|(\\t)|(\\\*)|(\\0)|
|[ -!]|(\\\"))*\"" ;
comment = cpp_comment | c_comment ;
cpp_comment = "//[ -~]*
" ;
c_comment = "(/\*/*\**(([ -)]|[0-~]|[+-.]| |
)/*\**)+\*/)|(/\*\*/)" ;


@@ -1,6 +0,0 @@
Kraken
filter remove_matches ^\s*//
filter remove_inline //.*$
filter call_regexp_common C
extension krak
3rd_gen_scale 1.0


@@ -1,78 +0,0 @@
import vector_literals: *
import string:*
import io:*
import mem: *
import util: *
import os: *
obj LLVMModule {}
obj LLVMType {}
obj LLVMValue {}
obj LLVMGenericValue {}
obj LLVMBasicBlock {}
obj LLVMBuilder {}
obj LLVMExecutionEngine {}
ext fun LLVMModuleCreateWithName(ModuleID: *char): *LLVMModule
ext fun LLVMInt32Type(): *LLVMType
ext fun LLVMFunctionType(ret_type: *LLVMType, param_types: **LLVMType, ParamCount: uint, isVarArg: int): *LLVMType
ext fun LLVMAddFunction(mod: *LLVMModule, name: *char, rettype: *LLVMType): *LLVMValue
ext fun LLVMAppendBasicBlock(func: *LLVMValue, name: *char): *LLVMBasicBlock
ext fun LLVMCreateBuilder(): *LLVMBuilder
ext fun LLVMPositionBuilderAtEnd(builder: *LLVMBuilder, block: *LLVMBasicBlock)
ext fun LLVMGetParam(func: *LLVMValue, num: int): *LLVMValue
ext fun LLVMBuildAdd(builder: *LLVMBuilder, first: *LLVMValue, second: *LLVMValue, name: *char): *LLVMValue
ext fun LLVMBuildRet(builder: *LLVMBuilder, value: *LLVMValue): *LLVMValue
ext fun LLVMVerifyModule(M: *LLVMModule, Action: int, error: **char): int
var LLVMAbortProcessAction = 1
ext fun LLVMDisposeMessage(error: *char)
ext fun LLVMLinkInMCJIT()
/*ext fun LLVMInitializeNativeTarget(): bool*/
ext fun LLVMInitializeX86Target(): bool
ext fun LLVMCreateExecutionEngineForModule(engine: **LLVMExecutionEngine, M: *LLVMModule, error: **char): int
ext fun LLVMCreateGenericValueOfInt(Ty: *LLVMType, N: ulong, IsSigned: int): *LLVMGenericValue
ext fun LLVMRunFunction(EE: *LLVMExecutionEngine, F: *LLVMValue, NumArgs: uint, Args: **LLVMGenericValue): *LLVMGenericValue
ext fun LLVMGenericValueToInt(GenVal: *LLVMGenericValue, IsSigned: int): ulong
#link("LLVM-3.8")
fun main(argc: int, argv: **char): int {
var mod = LLVMModuleCreateWithName("my_module")
var param_types = vector(LLVMInt32Type(), LLVMInt32Type())
var ret_type = LLVMFunctionType(LLVMInt32Type(), param_types.getBackingMemory(), (2) cast uint, 0)
var sum = LLVMAddFunction(mod, "sum", ret_type)
var entry = LLVMAppendBasicBlock(sum, "entry")
var builder = LLVMCreateBuilder()
LLVMPositionBuilderAtEnd(builder, entry)
var tmp = LLVMBuildAdd(builder, LLVMGetParam(sum,0), LLVMGetParam(sum,1), "tmp")
LLVMBuildRet(builder, tmp)
var error = null<char>()
LLVMVerifyModule(mod, LLVMAbortProcessAction, &error)
LLVMDisposeMessage(error)
var engine: *LLVMExecutionEngine
error = null<char>()
LLVMLinkInMCJIT()
/*LLVMInitializeNativeTarget()*/
// LLVMInitializeNativeTarget is static/inline :/
LLVMInitializeX86Target()
if (LLVMCreateExecutionEngineForModule(&engine, mod, &error)) {
error("Failed to create execution engine")
}
if (error) {
println(string("error: ") + error)
LLVMDisposeMessage(error)
exit(1)
}
if (argc < 3) error(string("usage: ") + argv[0] + " x y")
var x = string_to_num<ulong>(string(argv[1]))
var y = string_to_num<ulong>(string(argv[2]))
var args = vector(LLVMCreateGenericValueOfInt(LLVMInt32Type(), x, 0),
LLVMCreateGenericValueOfInt(LLVMInt32Type(), y, 0))
var res = LLVMRunFunction(engine, sum, 2u, args.getBackingMemory())
println(string("result: ") + (LLVMGenericValueToInt(res, 0)) cast int)
return 0
}

1578 mal.krak

File diff suppressed because it is too large


@@ -4,13 +4,7 @@ with import <nixpkgs> { };
mkShell {
LANG="en_US.UTF-8";
nativeBuildInputs = [
emscripten
nodejs
valgrind
kcachegrind
chicken
chez
racket
wabt
wasmtime
wasm3

Binary file not shown.


@@ -1,51 +0,0 @@
import symbol:*
import tree:*
import map:*
import util:*
import str:*
import io:*
import ast_nodes:*
import ast_transformation:*
import pass_common:*
fun address_of_ensure_variable_lower(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>) {
var visited = hash_set<*ast_node>()
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
var helper_before = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {
match(*node) {
ast_node::function_call(backing) {
if (is_function(backing.func) && backing.func->function.name == "&") {
var addresse = backing.parameters[0]
// Identifier is always fine. The other options are
// function call, value, or cast. The only fine one here
// is a function call of *
if ( (is_function_call(addresse) &&
!(is_function(addresse->function_call.func) &&
(addresse->function_call.func->function.name == "*" ||
addresse->function_call.func->function.name == "." ||
addresse->function_call.func->function.name == "->" ||
addresse->function_call.func->function.name == "[]"))) ||
is_value(addresse) || is_cast(addresse) ) {
// so we're a function call that's not * or a cast or value
// so make a temp variable for us
// Note that we don't have to worry about destruction because
// all object stuff has already run, so this isn't an object
var enclosing_block_idx = parent_chain->index_from_top_satisfying(is_code_block)
var enclosing_block = parent_chain->from_top(enclosing_block_idx)
var before_block_parent = parent_chain->from_top(enclosing_block_idx-1)
var ident = _ident("for_address_of_temp",
backing.func->function.type->parameter_types[0],
enclosing_block)
var decl = _declaration(ident, addresse)
add_before_in(decl, before_block_parent, enclosing_block)
replace_with_in(addresse, ident, node)
}
}
}
}
}
run_on_tree(helper_before, empty_pass_second_half(), syntax_ast_pair.second, &visited)
})
}
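
For orientation, the pass above only has to introduce a temporary when the operand of & is not already addressable: a plain function call (other than *, ., -> or []), a literal value, or a cast. A minimal C sketch of the same before/after rewrite, with hypothetical names (make_widget, use_widget) chosen only for illustration:

#include <stdio.h>

typedef struct { int id; } widget;

widget make_widget(void) { widget w = { 42 }; return w; }
void use_widget(widget *w) { printf("%d\n", w->id); }

int main(void) {
    /* intent: use_widget(&make_widget()) -- but a call result has no address,
       so the pass declares a temp before the enclosing statement and takes its address */
    widget for_address_of_temp = make_widget();
    use_widget(&for_address_of_temp);
    return 0;
}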


@@ -1,200 +0,0 @@
import symbol:*
import tree:*
import vec:*
import map:*
import util:*
import str:*
import mem:*
import io:*
import ast_nodes:*
import ast_transformation:*
import hash_set:*
import pass_common:*
fun adt_lower(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>) {
var type_def_option_map = map<*ast_node, vec<*ast_node>>()
var visited1 = hash_set<*ast_node>()
var visited2 = hash_set<*ast_node>()
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
var helper_before = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {
match(*node) {
ast_node::adt_def(backing) {
/*println(backing.name + ": transforming!")*/
type_def_option_map[node] = vec<*ast_node>()
var replacement = _type_def(backing.name, false);
// we're going to be replacing adt_def in the same ptr, so this works
replacement->type_def.self_type = node->adt_def.self_type
var option_union = _type_def(backing.name + "_union", true);
node->adt_def.options.for_each(fun(opt: *ast_node) {
if (!opt->identifier.type->is_empty_adt_option())
option_union->type_def.variables.add(_declaration(opt, null<ast_node>()))
type_def_option_map[node].add(opt)
})
var option_union_type = type_ptr(option_union)
option_union->type_def.self_type = option_union_type
var option_union_ident = _ident(str("data"), option_union_type, replacement)
replacement->type_def.variables.add(_declaration(option_union_ident, null<ast_node>()))
add_to_scope("data", option_union_ident, replacement)
var flag = _ident(str("flag"), type_ptr(base_type::integer()), replacement)
replacement->type_def.variables.add(_declaration(flag, null<ast_node>()))
add_to_scope("flag", flag, replacement)
add_before_in(option_union, node, parent_chain)
var enclosing_scope = node->adt_def.scope[str("~enclosing_scope")][0]
var idx = 0
node->adt_def.option_funcs.for_each(fun(func: *ast_node) {
var adt_type = replacement->type_def.self_type
var block = _code_block()
add_to_scope("~enclosing_scope", func, block)
func->function.body_statement = block
var to_ret = _ident(str("to_ret"), adt_type, block)
block->code_block.children.add(_declaration(to_ret, null<ast_node>()))
var value = _value(to_string(idx), type_ptr(base_type::integer()))
block->code_block.children.add(_assign(make_operator_call(".", vec(to_ret, flag)), value))
var opt = type_def_option_map[node][idx]
var lvalue = make_operator_call(".", vec(make_operator_call(".", vec(to_ret, option_union_ident)), opt))
if (func->function.parameters.size) {
// do copy_construct if it should
block->code_block.children.add(assign_or_copy_construct_statement(lvalue, func->function.parameters[0]))
}
block->code_block.children.add(_return(to_ret))
add_before_in(func, node, parent_chain)
add_to_scope(func->function.name, func, enclosing_scope)
add_to_scope("~enclosing_scope", enclosing_scope, func)
idx++
})
node->adt_def.regular_funcs.for_each(fun(func: *ast_node) {
var block = _code_block()
add_to_scope("~enclosing_scope", func, block)
func->function.body_statement = block
var func_this = func->function.this_param
if (func->function.name == "operator==") {
var other = func->function.parameters[0]
var if_stmt = _if(make_operator_call("!=", vec(make_operator_call("->", vec(func_this, flag)), make_operator_call(".", vec(other, flag)))))
if_stmt->if_statement.then_part = _return(_value(str("false"), type_ptr(base_type::boolean())))
block->code_block.children.add(if_stmt)
for (var i = 0; i < type_def_option_map[node].size; i++;) {
if (get_ast_type(type_def_option_map[node][i])->is_empty_adt_option())
continue
var if_stmt_inner = _if(make_operator_call("==", vec(make_operator_call("->", vec(func_this, flag)), _value(to_string(i), type_ptr(base_type::integer())))))
var option = type_def_option_map[node][i]
var our_option = make_operator_call(".", vec(make_operator_call("->", vec(func_this, option_union_ident)), option))
var their_option = make_operator_call(".", vec(make_operator_call(".", vec(other, option_union_ident)), option))
if_stmt_inner->if_statement.then_part = _return(possible_object_equality(our_option, their_option))
block->code_block.children.add(if_stmt_inner)
}
block->code_block.children.add(_return(_value(str("true"), type_ptr(base_type::boolean()))))
} else if (func->function.name == "operator!=") {
var other = func->function.parameters[0]
block->code_block.children.add(_return(make_operator_call("!", vec(make_method_call(func_this, "operator==", vec(other))))))
} else if (func->function.name == "construct") {
var value = _value(str("-1"), type_ptr(base_type::integer()))
block->code_block.children.add(_assign(make_operator_call("->", vec(func_this, flag)), value))
block->code_block.children.add(_return(func_this))
} else if (func->function.name == "copy_construct") {
var other = func->function.parameters[0]
block->code_block.children.add(_assign(make_operator_call("->", vec(func_this, flag)), make_operator_call("->", vec(other, flag))))
for (var i = 0; i < type_def_option_map[node].size; i++;) {
if (get_ast_type(type_def_option_map[node][i])->is_empty_adt_option())
continue
var if_stmt_inner = _if(make_operator_call("==", vec(make_operator_call("->", vec(func_this, flag)), _value(to_string(i), type_ptr(base_type::integer())))))
var option = type_def_option_map[node][i]
var our_option = make_operator_call(".", vec(make_operator_call("->", vec(func_this, option_union_ident)), option))
var their_option = make_operator_call(".", vec(make_operator_call("->", vec(other, option_union_ident)), option))
if_stmt_inner->if_statement.then_part = assign_or_copy_construct_statement(our_option, their_option)
block->code_block.children.add(if_stmt_inner)
}
block->code_block.children.add(_return(func_this))
} else if (func->function.name == "operator=") {
var other = func->function.parameters[0]
block->code_block.children.add(make_method_call(func_this, "destruct", vec<*ast_node>()))
block->code_block.children.add(make_method_call(func_this, "copy_construct", vec(make_operator_call("&", vec(other)))))
} else if (func->function.name == "destruct") {
for (var i = 0; i < type_def_option_map[node].size; i++;) {
var option = type_def_option_map[node][i]
var option_type = get_ast_type(option)
if (option_type->is_empty_adt_option())
continue
if (option_type->indirection == 0 && option_type->is_object() && has_method(option_type->type_def, "destruct", vec<*type>())) {
var if_stmt_inner = _if(make_operator_call("==", vec(make_operator_call("->", vec(func_this, flag)), _value(to_string(i), type_ptr(base_type::integer())))))
var our_option = make_operator_call(".", vec(make_operator_call("->", vec(func_this, option_union_ident)), option))
if_stmt_inner->if_statement.then_part = make_method_call(our_option, "destruct", vec<*ast_node>())
block->code_block.children.add(if_stmt_inner)
}
}
} else error("impossible adt method")
replacement->type_def.methods.add(func)
add_to_scope(func->function.name, func, replacement)
add_to_scope("~enclosing_scope", replacement, func)
})
add_to_scope("~enclosing_scope", enclosing_scope, option_union)
add_to_scope("~enclosing_scope", enclosing_scope, replacement)
*node = *replacement
}
}
}
run_on_tree(helper_before, empty_pass_second_half(), syntax_ast_pair.second, &visited1)
})
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
var second_helper = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {
match(*node) {
ast_node::match_statement(backing) {
var block = _code_block()
add_to_scope("~enclosing_scope", parent_chain->item_from_top_satisfying(fun(i: *ast_node): bool return is_code_block(i) || is_function(i);), block)
var value = backing.value
var holder = _ident(str("holder"), get_ast_type(value)->clone_with_increased_indirection(), block)
block->code_block.children.add(_declaration(holder, null<ast_node>()))
block->code_block.children.add(_assign(holder, make_operator_call("&", vec(value))))
backing.cases.for_each(fun(case_stmt: *ast_node) {
var option = case_stmt->case_statement.option
if (!get_ast_scope(get_ast_type(value)->type_def)->contains_key(str("flag")))
error("trying to get flag from struct without it - are you matching on not an adt? - " + get_ast_type(value)->to_string())
var flag = get_from_scope(get_ast_type(value)->type_def, "flag")
var data = get_from_scope(get_ast_type(value)->type_def, "data")
var option_num = -7
if (!type_def_option_map.contains_key(get_ast_type(value)->type_def))
error("trying to match on non-adt")
for (var i = 0; i < type_def_option_map[get_ast_type(value)->type_def].size; i++;)
if (type_def_option_map[get_ast_type(value)->type_def][i] == option)
option_num = i;
var condition = make_operator_call("==", vec(make_operator_call("->", vec(holder, flag)), _value(to_string(option_num), type_ptr(base_type::integer()))))
var if_stmt = _if(condition)
var inner_block = _code_block()
add_to_scope("~enclosing_scope", block, inner_block)
var unpack_ident = case_stmt->case_statement.unpack_ident
if (unpack_ident) {
var get_option = make_operator_call(".", vec(make_operator_call("->", vec(holder, data)), option))
get_option = make_operator_call("&", vec(get_option))
unpack_ident->identifier.type = unpack_ident->identifier.type->clone_with_ref()
inner_block->code_block.children.add(_declaration(unpack_ident, get_option))
}
inner_block->code_block.children.add(case_stmt->case_statement.statement)
if_stmt->if_statement.then_part = inner_block
block->code_block.children.add(if_stmt)
})
*node = *block
}
ast_node::function_call(backing) {
if (is_function(backing.func) && (backing.func->function.name == "." || backing.func->function.name == "->")) {
var left_type = get_ast_type(backing.parameters[0])
if (left_type->is_object() && is_identifier(backing.parameters[1])) {
for (var i = 0; i < left_type->type_def->type_def.variables.size; i++;)
if (left_type->type_def->type_def.variables[i]->declaration_statement.identifier == backing.parameters[1])
return;
/*println(backing.parameters[1]->identifier.name + " getting, adt . or -> call!")*/
var object = node->function_call.parameters[0]
node->function_call.parameters[0] = make_operator_call(backing.func->function.name, vec(object, get_from_scope(left_type->type_def, "data")))
node->function_call.func = get_builtin_function(".", vec(get_ast_type(get_from_scope(left_type->type_def, "data")), get_ast_type(backing.parameters[1])))
}
}
}
}
}
run_on_tree(second_helper, empty_pass_second_half(), syntax_ast_pair.second, &visited2)
})
}
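
The layout this lowering builds is an ordinary tagged union: each adt becomes a struct with an integer flag plus a union named data holding the non-empty options, the generated option constructors store the option index into flag, and a match statement is rewritten into a chain of if (holder->flag == i) tests that bind the unpacked identifier by address. A rough hand-written C analogue of that shape (the names shape, none, some are made up for the example; the compiler derives its own):

#include <stdio.h>

/* hand-lowered equivalent of an adt with options none and some: int */
union shape_union { int some; };                 /* empty options get no storage */
struct shape { union shape_union data; int flag; };

struct shape shape_none(void)  { struct shape s; s.flag = 0; return s; }
struct shape shape_some(int v) { struct shape s; s.flag = 1; s.data.some = v; return s; }

int main(void) {
    struct shape value = shape_some(7);
    struct shape *holder = &value;               /* match takes the matched value by address */
    if (holder->flag == 0)
        printf("none\n");
    if (holder->flag == 1) {                     /* case some(x) statement                  */
        int *x = &holder->data.some;             /* unpack identifier becomes a reference   */
        printf("some(%d)\n", *x);
    }
    return 0;
}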


@@ -1,324 +0,0 @@
import tree:*
import type2:*
import vec:*
import set:*
import util:*
import str:*
import mem:*
import binding:*
adt ast {
_translation_unit: str,
_import: pair<*tree<ast>, set<str>>,
_identifier: pair<str, *binding<type>>,
_binding: triple<str, vec<*binding<type>>, *binding<tree<ast>>>,
_type_def: str,
_adt_def: str,
_function: triple<str, *binding<type>, bool>,
// needs to be a map that retains order
_template: pair<str, map<str, *binding<type>>>,
_declaration,
_block,
_if,
_match,
_case,
_while,
_for,
_return,
_break,
_continue,
_defer,
_call: bool,
_compiler_intrinsic: triple<str, *binding<type>, vec<*binding<type>>>,
_cast: *binding<type>,
_value: pair<str, *binding<type>>
}
fun deref_to_string<T>(in: *T): str
if (in == mem::null<T>())
return str("null")
else
return to_string(in)
fun deref_to_string<T>(in: *T, ts: fun(*T): str): str
if (in == mem::null<T>())
return str("null")
else
return ts(in)
fun binding_deref_to_string<T>(b: *binding<T>): str {
var pre = b->get_bound_to(binding_epoch::pre_ref())
var post = b->get_bound_to(binding_epoch::post_ref())
if pre == post {
return deref_to_string(pre)
} else {
return "pre_ref:" + deref_to_string(pre) + "/post_ref:" + deref_to_string(post)
}
}
fun binding_deref_to_string<T>(b: *binding<T>, ts: fun(*T): str): str {
var pre = b->get_bound_to(binding_epoch::pre_ref())
var post = b->get_bound_to(binding_epoch::post_ref())
if pre == post {
return deref_to_string(pre, ts)
} else {
return "pre_ref:" + deref_to_string(pre, ts) + "/post_ref:" + deref_to_string(post, ts)
}
}
fun to_string(a: ref ast): str {
match(a) {
ast::_translation_unit(b) return str("_translation_unit(") + b + ")"
ast::_import(b) return str("_import(") + to_string(b.first->data) + ")[" + str(",").join(b.second.data) + "]"
ast::_identifier(b) return str("_identifier(") + b.first + ": " + binding_deref_to_string(b.second) + ")"
ast::_binding(b) return str("_binding(") + b.first + "[" + str(",").join(b.second.map(fun(x:*binding<type>): str { return binding_deref_to_string(x); })) + "]" + "-> " + binding_deref_to_string(b.third, fun(t: *tree<ast>): str return to_string(t->data);) + ")"
ast::_type_def(b) return str("_type_def(") + b + ")"
ast::_adt_def(b) return str("_adt_def(") + b + ")"
ast::_function(b) return str("_function(") + b.first + ": " + binding_deref_to_string(b.second) + ", ext?:" + to_string(b.third) + ")"
ast::_template(b) return str("_template(") + b.first + "[" + str(",").join(b.second.keys) + "])"
ast::_declaration() return str("_declaration")
ast::_block() return str("_block")
ast::_if() return str("_if")
ast::_match() return str("_match")
ast::_case() return str("_case")
ast::_while() return str("_while")
ast::_for() return str("_for")
ast::_return() return str("_return")
ast::_break() return str("_break")
ast::_continue() return str("_continue")
ast::_defer() return str("_defer")
ast::_call(b) return "_call(add_scope: " + to_string(b) + ")"
ast::_compiler_intrinsic(b) return str("_compiler_intrinsic(") + b.first + ": " + binding_deref_to_string(b.second) + ")"
ast::_cast(b) return str("_cast")
ast::_value(b) return str("_value(") + b.first + ": " + binding_deref_to_string(b.second) + ")"
}
}
fun _translation_unit(p: str): *tree<ast> {
return new<tree<ast>>()->construct(ast::_translation_unit(p))
}
fun _import(p1: *tree<ast>, p2: set<str>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_import(make_pair(p1,p2)))
}
fun _type_def(p: str): *tree<ast> {
return new<tree<ast>>()->construct(ast::_type_def(p))
}
fun _adt_def(p: str): *tree<ast> {
return new<tree<ast>>()->construct(ast::_adt_def(p))
}
fun _cast(p: *binding<type>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_cast(p))
}
fun _identifier(p1: str, p2: *binding<type>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_identifier(make_pair(p1, p2)))
}
fun _binding(p1: str, p2: vec<*binding<type>>, p3: *binding<tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_binding(make_triple(p1, p2, p3)))
}
fun _function(p1: str, p2: *binding<type>, p3: bool): *tree<ast> {
return new<tree<ast>>()->construct(ast::_function(make_triple(p1, p2, p3)))
}
fun _template(p1: str, p2: map<str, *binding<type>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_template(make_pair(p1, p2)))
}
fun _compiler_intrinsic(p1: str, p2: *binding<type>, p3: vec<*binding<type>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_compiler_intrinsic(make_triple(p1, p2, p3)))
}
fun _value(p1: str, p2: *binding<type>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_value(make_pair(p1, p2)))
}
fun _declaration(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_declaration())
}
fun _block(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_block())
}
fun _if(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_if())
}
fun _match(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_match())
}
fun _case(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_case())
}
fun _while(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_while())
}
fun _for(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_for())
}
fun _return(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_return())
}
fun _break(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_break())
}
fun _continue(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_continue())
}
fun _defer(): *tree<ast> {
return new<tree<ast>>()->construct(ast::_defer())
}
fun _call(add_scope: bool): *tree<ast> {
return new<tree<ast>>()->construct(ast::_call(add_scope))
}
fun _translation_unit(p: str, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_translation_unit(p), c)
}
fun _import(p1: *tree<ast>, p2: set<str>, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_import(make_pair(p1,p2)), c)
}
fun _type_def(p: str, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_type_def(p), c)
}
fun _adt_def(p: str, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_adt_def(p), c)
}
fun _cast(p: *binding<type>, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_cast(p), c)
}
fun _identifier(p1: str, p2: *binding<type>, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_identifier(make_pair(p1, p2)), c)
}
fun _binding(p1: str, p2: vec<*binding<type>>, p3: *binding<tree<ast>>, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_binding(make_triple(p1, p2, p3)), c)
}
fun _function(p1: str, p2: *binding<type>, p3: bool, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_function(make_triple(p1, p2, p3)), c)
}
fun _template(p1: str, p2: map<str, *binding<type>>, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_template(make_pair(p1, p2)), c)
}
fun _compiler_intrinsic(p1: str, p2: *binding<type>, p3: vec<*binding<type>>, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_compiler_intrinsic(make_triple(p1, p2, p3)), c)
}
fun _value(p1: str, p2: *binding<type>, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_value(make_pair(p1, p2)), c)
}
fun _declaration(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_declaration(), c)
}
fun _block(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_block(), c)
}
fun _if(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_if(), c)
}
fun _match(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_match(), c)
}
fun _case(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_case(), c)
}
fun _while(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_while(), c)
}
fun _for(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_for(), c)
}
fun _return(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_return(), c)
}
fun _defer(c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_defer(), c)
}
fun _call(add_scope: bool, c: ref vec<*tree<ast>>): *tree<ast> {
return new<tree<ast>>()->construct(ast::_call(add_scope), c)
}
fun is_translation_unit(i: *tree<ast>): bool { match(i->data) { ast::_translation_unit(b) return true; } return false; }
fun is_import(i: *tree<ast>): bool { match(i->data) { ast::_import(b) return true; } return false; }
fun is_identifier(i: *tree<ast>): bool { match(i->data) { ast::_identifier(b) return true; } return false; }
fun is_binding(i: *tree<ast>): bool { match(i->data) { ast::_binding(b) return true; } return false; }
fun is_type_def(i: *tree<ast>): bool { match(i->data) { ast::_type_def(b) return true; } return false; }
fun is_adt_def(i: *tree<ast>): bool { match(i->data) { ast::_adt_def(b) return true; } return false; }
fun is_function(i: *tree<ast>): bool { match(i->data) { ast::_function(b) return true; } return false; }
fun is_template(i: *tree<ast>): bool { match(i->data) { ast::_template(b) return true; } return false; }
fun is_declaration(i: *tree<ast>): bool { match(i->data) { ast::_declaration() return true; } return false; }
fun is_block(i: *tree<ast>): bool { match(i->data) { ast::_block() return true; } return false; }
fun is_if(i: *tree<ast>): bool { match(i->data) { ast::_if() return true; } return false; }
fun is_match(i: *tree<ast>): bool { match(i->data) { ast::_match() return true; } return false; }
fun is_case(i: *tree<ast>): bool { match(i->data) { ast::_case() return true; } return false; }
fun is_while(i: *tree<ast>): bool { match(i->data) { ast::_while() return true; } return false; }
fun is_for(i: *tree<ast>): bool { match(i->data) { ast::_for() return true; } return false; }
fun is_return(i: *tree<ast>): bool { match(i->data) { ast::_return() return true; } return false; }
fun is_break(i: *tree<ast>): bool { match(i->data) { ast::_break() return true; } return false; }
fun is_continue(i: *tree<ast>): bool { match(i->data) { ast::_continue() return true; } return false; }
fun is_defer(i: *tree<ast>): bool { match(i->data) { ast::_defer() return true; } return false; }
fun is_call(i: *tree<ast>): bool { match(i->data) { ast::_call(b) return true; } return false; }
fun is_compiler_intrinsic(i: *tree<ast>): bool { match(i->data) { ast::_compiler_intrinsic(b) return true; } return false; }
fun is_cast(i: *tree<ast>): bool { match(i->data) { ast::_cast(b) return true; } return false; }
fun is_value(i: *tree<ast>): bool { match(i->data) { ast::_value(b) return true; } return false; }
fun is_top_level_item(i: *tree<ast>): bool { return i->parent == null<tree<ast>>() || is_translation_unit(i->parent); }
fun get_ancestor_satisfying(t: *tree<ast>, p: fun(*tree<ast>): bool): *tree<ast> {
t = t->parent
while (t != null<tree<ast>>() && !p(t))
t = t->parent
return t
}
fun make_ast_binding(s: *char): *tree<ast> {
return make_ast_binding(str(s))
}
fun make_ast_binding(s: str): *tree<ast> {
return make_ast_binding(s, vec<*binding<type>>())
}
fun make_ast_binding(s: str, v: vec<*binding<type>>): *tree<ast> {
return _binding(s, v, binding<tree<ast>>())
}
fun clone_ast_binding(binding: *tree<ast>): *tree<ast> {
match(binding->data) {
ast::_binding(b) {
return _binding(b.first, b.second, b.third)
}
}
error("trying to get binding on not a binding")
}
fun get_ast_binding_inst_types(binding: *tree<ast>): ref vec<*binding<type>> {
match(binding->data) {
ast::_binding(b) {
return b.second
}
}
error("trying to get binding on not a binding")
}
fun get_ast_binding(binding: *tree<ast>, epoch: binding_epoch): *tree<ast> {
match(binding->data) {
ast::_binding(b) {
return b.third->get_bound_to(epoch)
}
}
error("trying to get binding on not a binding")
}
fun set_ast_binding(binding: *tree<ast>, to: *tree<ast>, epoch: binding_epoch) {
match(binding->data) {
ast::_binding(b) {
b.third->set(to, epoch)
return
}
}
error("trying to set binding on not a binding")
}
fun set_single_ast_binding(binding: *tree<ast>, to: *tree<ast>, epoch: binding_epoch) {
match(binding->data) {
ast::_binding(b) {
b.third->set_single(to, epoch)
return
}
}
error("trying to set binding on not a binding")
}
fun ast_bound(binding: *tree<ast>): bool {
match(binding->data) {
ast::_binding(b) return b.third->bound()
}
error("Trying to check bound for not a binding")
}
fun ast_binding_str(binding: *tree<ast>): str {
match(binding->data) {
ast::_binding(b) return b.first
}
error("Trying to get name for not a binding")
}

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -1,111 +0,0 @@
import vec:*
import str:*
// for decent to string
// should be fixed by UFCS or decent scoping on template types
import ast:*
import type2:*
adt binding_epoch {
pre_ref,
post_ref,
all
}
var bindings: *vec<*void>
fun binding<T>(): *binding<T> {
return binding(null<T>(), binding_epoch::all())
}
fun binding_p<T>(it: T, epoch: binding_epoch): *binding<T> {
var p = new<T>()
p->copy_construct(&it)
return binding(p, epoch)
}
fun binding<T>(it: *T, epoch: binding_epoch): *binding<T> {
var to_ret = new<binding<T>>()->construct(it, epoch)
if (bindings == null<vec<*void>>())
bindings = new<vec<*void>>()->construct()
bindings->add( (to_ret) cast *void )
return to_ret
}
obj binding<T> (Object) {
var bound_to_pre_ref: *T
var bound_to_post_ref: *T
fun construct(): *binding<T> {
bound_to_pre_ref = null<T>()
bound_to_post_ref = null<T>()
return this
}
fun construct(it: *T, epoch: binding_epoch): *binding<T> {
bound_to_pre_ref = null<T>()
bound_to_post_ref = null<T>()
set_single(it, epoch)
return this
}
fun copy_construct(old: *binding<T>): void {
bound_to_pre_ref = old->bound_to_pre_ref
bound_to_post_ref = old->bound_to_post_ref
}
fun destruct() {
bound_to_pre_ref = null<T>()
bound_to_post_ref = null<T>()
}
fun bound(epoch: binding_epoch): bool {
return bound_to_pre_ref != null<T>() || bound_to_post_ref != null<T>()
}
fun set(to: T, epoch: binding_epoch) {
var p = new<T>()
p->copy_construct(&to)
set(p, epoch)
}
fun set(to: *T, epoch: binding_epoch) {
var pre_ref_from = bound_to_pre_ref
var post_ref_from = bound_to_post_ref
if epoch == binding_epoch::pre_ref() || epoch == binding_epoch::all() {
bound_to_pre_ref = to
// don't set null, that will set all unbound ones
if pre_ref_from != null<T>() {
for (var i = 0; i < bindings->size; i++;)
if ( ((bindings->get(i)) cast *binding<T>)->bound_to_pre_ref == pre_ref_from)
((bindings->get(i)) cast *binding<T>)->bound_to_pre_ref = to
}
}
if epoch == binding_epoch::post_ref() || epoch == binding_epoch::all() {
bound_to_post_ref = to
// don't set null, that will set all unbound ones
if post_ref_from != null<T>() {
for (var i = 0; i < bindings->size; i++;)
if ( ((bindings->get(i)) cast *binding<T>)->bound_to_post_ref == post_ref_from)
((bindings->get(i)) cast *binding<T>)->bound_to_post_ref = to
}
}
}
fun set_single(to: T, epoch: binding_epoch) {
var p = new<T>()
p->copy_construct(&to)
set_single(p, epoch)
}
fun set_single(to: *T, epoch: binding_epoch) {
match (epoch) {
binding_epoch::pre_ref() { bound_to_pre_ref = to; }
binding_epoch::post_ref() { bound_to_post_ref = to; }
binding_epoch::all() { bound_to_pre_ref = to; bound_to_post_ref = to; }
}
}
fun bound(): bool {
return bound_to_pre_ref != null<T>() || bound_to_post_ref != null<T>()
}
fun get_bound_to(epoch: binding_epoch): *T {
match (epoch) {
binding_epoch::pre_ref() { return bound_to_pre_ref; }
binding_epoch::post_ref() if bound_to_post_ref != null<T>() { return bound_to_post_ref; } else { return bound_to_pre_ref; }
binding_epoch::all() { error("trying to get_bound_to for all, which doesn't make any sense"); }
}
}
fun to_string(): str {
/*return "binding(" + to_string(bound_to) + ")"*/
return "binding(pre_ref:" + deref_to_string(bound_to_pre_ref) + "/post_ref:" + deref_to_string(bound_to_post_ref) + ")"
}
}
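
The subtle part of binding<T>::set is that rebinding is global: besides updating this binding, it walks the bindings registry and redirects every other binding whose pre_ref or post_ref pointer still targets the old value (skipping the null case so unbound bindings are left alone). A stripped-down C analogue of that redirect loop, single epoch only, with a fixed-size registry purely for illustration:

#include <stdio.h>

#define MAX_BINDINGS 64

typedef struct { void *bound_to; } binding_t;

static binding_t pool[MAX_BINDINGS];
static binding_t *registry[MAX_BINDINGS];
static int registry_len = 0;

static binding_t *binding_new(void *target) {
    binding_t *b = &pool[registry_len];
    b->bound_to = target;
    registry[registry_len++] = b;
    return b;
}

/* Point b at 'to' and redirect every binding that pointed at b's old target. */
static void binding_set(binding_t *b, void *to) {
    void *from = b->bound_to;
    b->bound_to = to;
    if (from == NULL)                 /* don't sweep on null: that would catch all unbound ones */
        return;
    for (int i = 0; i < registry_len; i++)
        if (registry[i]->bound_to == from)
            registry[i]->bound_to = to;
}

int main(void) {
    int a = 1, c = 2;
    binding_t *x = binding_new(&a);
    binding_t *y = binding_new(&a);   /* aliases the same target as x */
    binding_set(x, &c);
    printf("%d %d\n", *(int *)x->bound_to, *(int *)y->bound_to);   /* prints: 2 2 */
    return 0;
}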

File diff suppressed because it is too large


@@ -1,509 +0,0 @@
import io:*
import mem:*
import map:*
import hash_map:*
import stack:*
import str:*
import util:*
import tree:*
import symbol:*
import ast_nodes:*
// for error with syntax tree
import pass_common:*
import poset:*
obj c_generator (Object) {
var id_counter: int
var ast_name_map: hash_map<*ast_node, str>
var used_names: hash_set<str>
var function_type_map: map<type, str>
var replacement_map: map<str, str>
var longest_replacement: int
var function_typedef_string: str
var linker_string: str
fun construct(): *c_generator {
id_counter = 0
ast_name_map.construct()
used_names.construct()
// to avoid using c keywords
used_names.add(str("extern"))
used_names.add(str("register"))
function_type_map.construct()
function_typedef_string.construct()
linker_string.construct()
replacement_map.construct()
replacement_map[str("+")] = str("plus")
replacement_map[str("-")] = str("minus")
replacement_map[str("*")] = str("star")
replacement_map[str("/")] = str("div")
replacement_map[str("%")] = str("mod")
replacement_map[str("^")] = str("carat")
replacement_map[str("&")] = str("amprsd")
replacement_map[str("|")] = str("pipe")
replacement_map[str("~")] = str("tilde")
replacement_map[str("!")] = str("exlmtnpt")
replacement_map[str(",")] = str("comma")
replacement_map[str("=")] = str("eq")
replacement_map[str("++")] = str("dbplus")
replacement_map[str("--")] = str("dbminus")
replacement_map[str("<<")] = str("dbleft")
replacement_map[str(">>")] = str("dbright")
replacement_map[str("::")] = str("scopeop")
replacement_map[str(":")] = str("colon")
replacement_map[str("==")] = str("dbq")
replacement_map[str("!=")] = str("notequals")
replacement_map[str("&&")] = str("doubleamprsnd")
replacement_map[str("||")] = str("doublepipe")
replacement_map[str("+=")] = str("plusequals")
replacement_map[str("-=")] = str("minusequals")
replacement_map[str("/=")] = str("divequals")
replacement_map[str("%=")] = str("modequals")
replacement_map[str("^=")] = str("caratequals")
replacement_map[str("&=")] = str("amprsdequals")
replacement_map[str("|=")] = str("pipeequals")
replacement_map[str("*=")] = str("starequals")
replacement_map[str("<<=")] = str("doublerightequals")
replacement_map[str("<")] = str("lt")
replacement_map[str(">")] = str("gt")
replacement_map[str(">>=")] = str("doubleleftequals")
replacement_map[str("(")] = str("openparen")
replacement_map[str(")")] = str("closeparen")
replacement_map[str("[")] = str("obk")
replacement_map[str("]")] = str("cbk")
replacement_map[str(" ")] = str("_")
replacement_map[str(".")] = str("dot")
replacement_map[str("->")] = str("arrow")
longest_replacement = 0
replacement_map.for_each(fun(key: str, value: str) {
if (key.length() > longest_replacement)
longest_replacement = key.length()
})
return this
}
fun copy_construct(old: *c_generator) {
id_counter = old->id_counter
ast_name_map.copy_construct(&old->ast_name_map)
used_names.copy_construct(&old->used_names)
function_type_map.copy_construct(&old->function_type_map)
function_typedef_string.copy_construct(&old->function_typedef_string)
replacement_map.copy_construct(&old->replacement_map)
longest_replacement = old->longest_replacement
linker_string.copy_construct(&old->linker_string)
}
fun operator=(other: ref c_generator) {
destruct()
copy_construct(&other)
}
fun destruct() {
ast_name_map.destruct()
used_names.destruct()
function_type_map.destruct()
function_typedef_string.destruct()
replacement_map.destruct()
linker_string.destruct()
}
fun get_id(): str return to_string(id_counter++);
fun generate_function_prototype_and_header(child: *ast_node):pair<str,str> {
var backing = child->function
var parameter_types = str()
var parameters = str()
var decorated_name = str()
if (backing.is_extern)
decorated_name = backing.name
else
decorated_name = generate_function(child)
backing.parameters.for_each(fun(parameter: *ast_node) {
if (parameter_types != "") { parameter_types += ", "; parameters += ", ";}
parameter_types += type_to_c(parameter->identifier.type)
parameters += type_to_c(parameter->identifier.type) + " " + get_name(parameter)
})
if (backing.is_variadic) {
parameter_types += ", ..."
parameters += ", ..."
}
return make_pair(type_to_c(backing.type->return_type) + " " + decorated_name + "(" + parameter_types + ");\n",
type_to_c(backing.type->return_type) + " " + decorated_name + "(" + parameters + ")")
}
fun generate_c(name_ast_map: map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax_in: map<*ast_node, *tree<symbol>> ): pair<str,str> {
var prequal: str = "#include <stdbool.h>\n"
var plain_typedefs: str = "\n/**Plain Typedefs**/\n"
var top_level_c_passthrough: str = ""
var variable_extern_declarations: str = ""
var structs: str = "\n/**Type Structs**/\n"
function_typedef_string = "\n/**Typedefs**/\n"
var function_prototypes: str = "\n/**Function Prototypes**/\n"
var function_definitions: str = "\n/**Function Definitions**/\n"
var variable_declarations: str = "\n/**Variable Declarations**/\n"
// moved out from below so that it can be used for methods as well as regular functions (and eventually lambdas...)
var generate_function_definition = fun(child: *ast_node) {
var backing = child->function
var prototype_and_header = generate_function_prototype_and_header(child)
function_prototypes += prototype_and_header.first
if (!backing.is_extern)
function_definitions += prototype_and_header.second
if (backing.body_statement) {
function_definitions += " {\n" + generate(backing.body_statement)
function_definitions += ";\n}\n"
}
}
var type_poset = poset<*ast_node>()
// iterate through asts
name_ast_map.for_each(fun(name: str, tree_pair: pair<*tree<symbol>,*ast_node>) {
// iterate through children for each ast
// do lambdas separately, so we can reconstitute the enclosing object if it has one
tree_pair.second->translation_unit.lambdas.for_each(fun(child: *ast_node) {
generate_function_definition(child)
})
tree_pair.second->translation_unit.children.for_each(fun(child: *ast_node) {
match (*child) {
ast_node::if_comp(backing) error("if_comp not currently supported")
ast_node::simple_passthrough(backing) error("simple_passthrough removed")
ast_node::declaration_statement(backing) variable_declarations += generate_declaration_statement(child) + ";\n" // false - don't do defer
// shouldn't need to do anything with return, as the intrinsic should be something like link
ast_node::compiler_intrinsic(backing) generate_compiler_intrinsic(child)
ast_node::function(backing) {
// check for and add to parameters if a closure
generate_function_definition(child)
}
ast_node::template(backing) {
backing.instantiated.for_each(fun(node: *ast_node) {
match (*node) {
ast_node::function(backing) generate_function_definition(node)
ast_node::type_def(backing) {
type_poset.add_job(node)
backing.variables.for_each(fun(i: *ast_node) {
var var_type = get_ast_type(i->declaration_statement.identifier)
if (!var_type->indirection && var_type->type_def)
type_poset.add_open_dep(node, var_type->type_def)
})
}
}
})
}
ast_node::type_def(backing) {
type_poset.add_job(child)
backing.variables.for_each(fun(i: *ast_node) {
var var_type = get_ast_type(i->declaration_statement.identifier)
if (!var_type->indirection && var_type->type_def)
type_poset.add_open_dep(child, var_type->type_def)
})
}
ast_node::adt_def(backing) error("ADT remaining!")
}
})
})
type_poset.get_sorted().for_each(fun(vert: *ast_node) {
var base_name = get_name(vert)
plain_typedefs += str("typedef ")
if (vert->type_def.is_union) {
plain_typedefs += "union "
structs += "union "
} else {
plain_typedefs += "struct "
structs += "struct "
}
plain_typedefs += base_name + "_dummy " + base_name + ";\n"
structs += base_name + "_dummy {\n"
vert->type_def.variables.for_each(fun(variable_declaration: *ast_node) structs += generate_declaration_statement(variable_declaration) + ";\n";)
// generate the methods (note some of these may be templates)
vert->type_def.methods.for_each(fun(method: *ast_node) {
if (is_template(method))
method->template.instantiated.for_each(fun(m: *ast_node) generate_function_definition(m);)
else
generate_function_definition(method);
})
structs += "};\n"
})
return make_pair(prequal+plain_typedefs+function_typedef_string+top_level_c_passthrough+variable_extern_declarations+structs+function_prototypes+variable_declarations+function_definitions + "\n", linker_string)
}
fun generate_declaration_statement(node: *ast_node): str {
var identifier = node->declaration_statement.identifier
var ident_type = identifier->identifier.type
var to_ret = type_to_c(identifier->identifier.type) + " " + get_name(identifier)
if (identifier->identifier.is_extern)
to_ret = "extern " + to_ret
if (node->declaration_statement.expression) {
// in case of recursive closures, make sure variable is declared before assignment
/*to_ret += ";\n"*/
/*to_ret += get_name(identifier) + " = " + generate(node->declaration_statement.expression)*/
to_ret += " = " + generate(node->declaration_statement.expression)
}
if (node->declaration_statement.init_method_call) {
error("init_method_call remaining")
}
return to_ret
}
fun generate_assignment_statement(node: *ast_node): str {
return generate(node->assignment_statement.to) + " = " + generate(node->assignment_statement.from)
}
fun generate_if_statement(node: *ast_node): str {
var if_str = "if (" + generate(node->if_statement.condition) + ") {\n" + generate(node->if_statement.then_part) + "}"
if (node->if_statement.else_part)
if_str += " else {\n" + generate(node->if_statement.else_part) + "}"
return if_str + "\n"
}
fun generate_while_loop(node: *ast_node): str {
return "while (" + generate(node->while_loop.condition) + ")\n" + generate(node->while_loop.statement)
}
fun generate_for_loop(node: *ast_node): str {
var init = str(";")
if (node->for_loop.init)
init = generate(node->for_loop.init)
var cond = str(";")
if (node->for_loop.condition)
cond = generate(node->for_loop.condition)
// gotta take off last semicolon
var update = str()
if (node->for_loop.update) {
update = generate(node->for_loop.update)
if (update.length() < 2)
error("update less than 2! Likely legal, but need easy compiler mod here")
update = update.slice(0,-2)
}
return "for (" + init + cond + "; " + update + ")\n" + generate(node->for_loop.body)
}
fun generate_identifier(node: *ast_node): str {
if (get_ast_type(node)->is_ref)
error("still existin ref in identifier")
return get_name(node)
}
fun generate_return_statement(node: *ast_node): str {
if (node->return_statement.return_value)
return "return " + generate(node->return_statement.return_value)
return str("return")
}
fun generate_branching_statement(node: *ast_node): str {
match(node->branching_statement.b_type) {
branching_type::break_stmt() return str("break")
branching_type::continue_stmt() return str("continue")
}
}
fun generate_cast(node: *ast_node): str {
return "((" + type_to_c(node->cast.to_type) + ")(" + generate(node->cast.value) + "))"
}
fun generate_value(node: *ast_node): str {
var value = node->value.string_value
if (node->value.value_type->base == base_type::character() && node->value.value_type->indirection == 0)
return "'" + value + "'"
if (node->value.value_type->base != base_type::character() || node->value.value_type->indirection != 1)
return value
var to_ret = str("\"") //"
value.for_each(fun(c: char) {
if (c == '\n')
to_ret += "\\n"
else if (c == '\\')
to_ret += "\\\\"
else if (c == '"')
to_ret += "\\\""
else
to_ret += c
})
return to_ret + "\""
}
fun generate_code_block(node: *ast_node): str {
var to_ret = str("{\n")
node->code_block.children.for_each(fun(child: *ast_node) to_ret += generate(child) + ";\n";)
return to_ret + "}"
}
// this generates the function as a value, not the actual function
fun generate_function(node: *ast_node): str {
return get_name(node)
}
fun generate_function_call(node: *ast_node): str {
var func_name = generate(node->function_call.func)
var call_string = str()
var func_return_type = get_ast_type(node)
var parameters = node->function_call.parameters
if ( parameters.size == 2 && (func_name == "+" || func_name == "-" || func_name == "*" || func_name == "/"
|| func_name == "<" || func_name == ">" || func_name == "<=" || func_name == ">="
|| func_name == "==" || func_name == "!=" || func_name == "%" || func_name == "^"
|| func_name == "|" || func_name == "&" || func_name == ">>" || func_name == "<<"
))
return "(" + generate(parameters[0]) + func_name + generate(parameters[1]) + ")"
if ( parameters.size == 2 && (func_name == "||" || func_name == "&&"))
error("Remaining || or &&")
// don't propagate enclosing function down right of access
// XXX what about enclosing object? should it be the thing on the left?
if (func_name == "." || func_name == "->")
return "(" + generate(parameters[0]) + func_name + generate(parameters[1]) + ")"
if (func_name == "[]")
return "(" + generate(parameters[0]) + "[" + generate(parameters[1]) + "])"
// the post ones need to be post-ed specifically, and take the p off
if (func_name == "++p" || func_name == "--p")
return "(" + generate(parameters[0]) + ")" + func_name.slice(0,-2)
// So we don't end up copy_constructing etc, we just handle the unary operators right here
if (func_name == "*" || func_name == "&")
return "(" + func_name + generate(parameters[0]) + ")"
var func_type = get_ast_type(node->function_call.func)
// regular parameter generation
for (var i = 0; i < parameters.size; i++;) {
var param = parameters[i]
var in_function_param_type = null<type>()
// grab type from param itself if we're out of param types (because variadic function)
if (i < func_type->parameter_types.size)
in_function_param_type = func_type->parameter_types[i]
else
in_function_param_type = get_ast_type(param)->clone_without_ref()
if (call_string != "")
call_string += ", "
call_string += generate(param)
}
call_string = func_name + "(" + call_string + ")"
return call_string
}
fun generate_compiler_intrinsic(node: *ast_node): str {
if (node->compiler_intrinsic.intrinsic == "sizeof") {
if (node->compiler_intrinsic.parameters.size || node->compiler_intrinsic.type_parameters.size != 1)
error("wrong parameters to sizeof compiler intrinsic")
return "sizeof(" + type_to_c(node->compiler_intrinsic.type_parameters[0]) + ")"
} else if (node->compiler_intrinsic.intrinsic == "link") {
node->compiler_intrinsic.parameters.for_each(fun(value: *ast_node) {
linker_string += str("-l") + value->value.string_value + " "
})
return str()
}
error(node->compiler_intrinsic.intrinsic + ": unknown intrinsic")
return str("ERROR")
}
fun generate(node: *ast_node): str {
if (!node) return str("/*NULL*/")
match (*node) {
ast_node::declaration_statement(backing) return generate_declaration_statement(node)
ast_node::assignment_statement(backing) return generate_assignment_statement(node)
ast_node::if_statement(backing) return generate_if_statement(node)
ast_node::while_loop(backing) return generate_while_loop(node)
ast_node::for_loop(backing) return generate_for_loop(node)
ast_node::function(backing) return generate_function(node)
ast_node::function_call(backing) return generate_function_call(node)
ast_node::compiler_intrinsic(backing) return generate_compiler_intrinsic(node)
ast_node::code_block(backing) return generate_code_block(node)
ast_node::return_statement(backing) return generate_return_statement(node)
ast_node::branching_statement(backing) return generate_branching_statement(node)
ast_node::defer_statement(backing) error("unremoved defer")
ast_node::match_statement(backing) error("unremoved match")
ast_node::cast(backing) return generate_cast(node)
ast_node::value(backing) return generate_value(node)
ast_node::identifier(backing) return generate_identifier(node)
}
error(str("COULD NOT GENERATE ") + get_ast_name(node))
return str("/* COULD NOT GENERATE */")
}
fun type_to_c(type: *type): str {
var indirection = str()
if (type->is_ref) error("still ref in type_to_c") //indirection += "/*ref*/ *"
for (var i = 0; i < type->indirection; i++;) indirection += "*"
match (type->base) {
base_type::none() return str("none") + indirection
base_type::template() return str("template") + indirection
base_type::template_type() return str("template_type") + indirection
base_type::void_return() return str("void") + indirection
base_type::boolean() return str("bool") + indirection
base_type::character() return str("char") + indirection
base_type::ucharacter() return str("unsigned char") + indirection
base_type::short_int() return str("short") + indirection
base_type::ushort_int() return str("unsigned short") + indirection
base_type::integer() return str("int") + indirection
base_type::uinteger() return str("unsigned int") + indirection
base_type::long_int() return str("long") + indirection
base_type::ulong_int() return str("unsigned long") + indirection
base_type::floating() return str("float") + indirection
base_type::double_precision() return str("double") + indirection
base_type::object() return get_name(type->type_def) + indirection
base_type::function() {
type = type->clone_with_indirection(0,false)
if (!function_type_map.contains_key(*type)) {
var temp_name = str("function") + get_id()
var temp = str()
type->parameter_types.for_each(fun(parameter_type: *type) {
temp += str(", ") + type_to_c(parameter_type) + " "
temp_name += "_" + cify_name(type_to_c(parameter_type))
})
if (type->is_raw)
function_typedef_string += str("typedef ") + type_to_c(type->return_type) + " (*" + temp_name + ")(" + temp.slice(1,-1) + ");\n"
else
error(type->to_string() + " is not raw!")
// again, the indirection
function_type_map[*type] = temp_name
}
return function_type_map[*type] + indirection
}
}
return str("impossible type") + indirection
}
fun type_decoration(type: *type): str {
return cify_name(type->to_string())
}
fun get_name(node: *ast_node): str {
var maybe_it = ast_name_map.get_ptr_or_null(node);
if (maybe_it)
return *maybe_it
var result = str("impossible name")
var make_unique = true
match (*node) {
ast_node::type_def(backing) {
var upper = backing.scope[str("~enclosing_scope")][0]
result = cify_name(backing.name)
if (is_template(upper))
upper->template.instantiated_map.reverse_get(node).for_each(fun(t: ref type) result += str("_") + type_decoration(&t);)
}
ast_node::function(backing) {
// be careful, operators like . come through this
if (backing.name == "main" || backing.is_extern || !backing.body_statement) {
result = backing.name
make_unique = false
} else {
result = "fun_"
var upper = backing.scope.get_with_default(str("~enclosing_scope"), vec(null<ast_node>()))[0]
if (upper && is_type_def(upper))
result += get_name(upper) + "_"
result += cify_name(node->function.name)
node->function.parameters.for_each(fun(param: *ast_node) result += str("_") + type_decoration(param->identifier.type);)
}
}
ast_node::identifier(backing) {
if (backing.name == "this" || backing.is_extern)
make_unique = false
result = backing.name
}
}
if (result == "impossible name")
error("HUGE PROBLEMS")
if (make_unique && used_names.contains(result))
result += get_id()
ast_name_map.set(node, result)
used_names.add(result)
return result
}
fun cify_name(name: str): str {
var to_ret = str()
for (var i = 0; i < name.length(); i++;) {
var replaced = false
for (var j = longest_replacement; j > 0; j--;) {
if (i + j <= name.length() && replacement_map.contains_key(name.slice(i,i+j))) {
to_ret += replacement_map[name.slice(i,i+j)]
replaced = true
i += j-1;
break
}
}
if (!replaced)
to_ret += name[i]
}
return to_ret
}
}

View File

@@ -1,37 +0,0 @@
import symbol:*
import tree:*
import vec:*
import map:*
import util:*
import str:*
import mem:*
import io:*
import ast_nodes:*
import ast_transformation:*
import pass_common:*
fun get_line(node: *tree<symbol>, name: str): *ast_node {
var to_ret = _passthrough()
to_ret->simple_passthrough.passthrough_str = str("\n#line ") + get_first_terminal(node)->data.position + " \"" + name + "\"\n"
return to_ret
}
fun c_line_control(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>) {
var first = true
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
/*var helper = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {*/
/*match(*node) {*/
/*if (is_code_block(parent_chain->top()) && ast_to_syntax->contains_key(node)) {*/
/*println(str("adding ") + get_ast_name(node) + " to " + get_ast_name(parent))*/
/*add_before_in(get_line(ast_to_syntax->get(node), name), node, parent_chain->top())*/
/*}*/
/*}*/
/*}*/
/*if (first)*/
/*run_on_tree(helper, empty_pass_second_half(), syntax_ast_pair.second)*/
first = false
})
}

View File

@@ -1,46 +0,0 @@
import symbol:*
import tree:*
import vec:*
import map:*
import util:*
import str:*
import mem:*
import io:*
import ast_nodes:*
import ast_transformation:*
import interpreter:*
import hash_set:*
import pass_common:*
fun ctce_lower(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>) {
var visited = hash_set<*ast_node>()
var globals = setup_globals(*name_ast_map)
var ctce_passes = vec<*ast_node>()
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
var helper_before = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {
match(*node) {
ast_node::compiler_intrinsic(backing) {
if (backing.intrinsic == "ctce") {
var result = evaluate_with_globals(backing.parameters[0], &globals)
*node = *unwrap_value(result)
} else if (backing.intrinsic == "ctce_pass") {
ctce_passes.add(backing.parameters[0])
remove(node, parent_chain)
}
}
}
}
run_on_tree(helper_before, empty_pass_second_half(), syntax_ast_pair.second, &visited)
})
ctce_passes.for_each(fun(func: *ast_node) {
// fully qualify interpreter::value so we don't pick up ast_node::value
var params = vec<interpreter::value>()
// easier to pick up types from the function itself
if (!is_function(func)) error(str("trying to CTCE pass with non function") + get_ast_name(func))
params.add(interpreter::value::pointer(make_pair((name_ast_map) cast *void, func->function.type->parameter_types[0])))
params.add(interpreter::value::pointer(make_pair((ast_to_syntax) cast *void, func->function.type->parameter_types[1])))
call_function(func, params, &globals)
})
}

View File

@@ -1,81 +0,0 @@
import symbol:*
import tree:*
import vec:*
import map:*
import util:*
import str:*
import mem:*
import io:*
import ast_nodes:*
import ast_transformation:*
import hash_set:*
import pass_common:*
fun defer_lower(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>) {
var visited = hash_set<*ast_node>()
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
var defer_triple_stack = stack<stack<stack<*ast_node>>>()
var loop_stack = stack(-1)
var helper_before = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {
match(*node) {
ast_node::defer_statement(backing) {
if (is_code_block(parent_chain->top())) {
remove(node, parent_chain)
defer_triple_stack.top().top().push(backing.statement)
} else {
replace_with_in(node, backing.statement, parent_chain)
}
}
ast_node::code_block(backing) {
defer_triple_stack.top().push(stack<*ast_node>())
}
ast_node::for_loop(backing) {
loop_stack.push(defer_triple_stack.top().size())
}
ast_node::while_loop(backing) {
loop_stack.push(defer_triple_stack.top().size())
}
ast_node::function(backing) {
defer_triple_stack.push(stack<stack<*ast_node>>())
}
}
}
var helper_after = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {
match(*node) {
ast_node::branching_statement(backing) {
var block = _code_block()
add_to_scope("~enclosing_scope", parent_chain->item_from_top_satisfying(fun(i: *ast_node): bool return is_code_block(i) || is_function(i);), block)
replace_with_in(node, block, parent_chain)
for (var i = 0; i < defer_triple_stack.top().size() - loop_stack.top(); i++;)
block->code_block.children.add_all(defer_triple_stack.top().from_top(i).reverse_vector())
block->code_block.children.add(node)
}
ast_node::return_statement(backing) {
var block = parent_chain->top()
if (!is_code_block(block))
error("defer doesn't have block - it should from obj lower")
for (var i = 0; i < defer_triple_stack.top().size(); i++;) {
defer_triple_stack.top().from_top(i).reverse_vector().for_each(fun(c: *ast_node) {
add_before_in(c, node, block)
})
}
}
ast_node::code_block(backing) {
node->code_block.children.add_all(defer_triple_stack.top().pop().reverse_vector())
}
ast_node::for_loop(backing) {
loop_stack.pop()
}
ast_node::while_loop(backing) {
loop_stack.pop()
}
ast_node::function(backing) {
defer_triple_stack.pop()
}
}
}
run_on_tree(helper_before, helper_after, syntax_ast_pair.second, &visited)
})
}

View File

@@ -1,329 +0,0 @@
import symbol:*
import tree:*
import vec:*
import map:*
import util:*
import type:*
import str:*
import mem:*
import io:*
import ast_nodes:*
import ast_transformation:*
import hash_set:*
import os:*
import pass_common:*
obj function_parent_block {
var function: *ast_node
var parent: *ast_node
var parent_block: *ast_node
var parent_function: *ast_node
}
fun make_function_parent_block(function: *ast_node, parent: *ast_node, parent_block: *ast_node, parent_function: *ast_node): function_parent_block {
var result: function_parent_block
result.function = function
result.parent = parent
result.parent_block = parent_block
result.parent_function = parent_function
return result
}
fun find_closed_variables(func: *ast_node, node: *ast_node): set<*ast_node> {
if (!node) return set<*ast_node>()
match (*node) {
ast_node::identifier(backing) {
if (!in_scope_chain(backing.enclosing_scope, func)) {
if (backing.name == "temporary_return_boomchaka" ||
backing.name == "temp_boom_return")
error("trying to close over temp return")
else
return set(node);
}
}
ast_node::code_block(backing) {
var to_ret = set<*ast_node>()
backing.children.for_each(fun(n: *ast_node) to_ret += find_closed_variables(func, n);)
return to_ret
}
ast_node::function_call(backing) {
if (is_function(backing.func) && (backing.func->function.name == "." || backing.func->function.name == "->"))
return find_closed_variables(func, backing.parameters.first())
var to_ret = find_closed_variables(func, backing.func)
backing.parameters.for_each(fun(n: *ast_node) to_ret += find_closed_variables(func, n);)
return to_ret
}
ast_node::function(backing) {
// if this is a lambda, we need to check all of the things it closes over
var to_ret = set<*ast_node>()
backing.closed_variables.for_each(fun(n: *ast_node) to_ret += find_closed_variables(func, n);)
return to_ret
}
ast_node::return_statement(backing) return find_closed_variables(func, backing.return_value)
ast_node::if_statement(backing) return find_closed_variables(func, backing.condition) + find_closed_variables(func, backing.then_part) + find_closed_variables(func, backing.else_part)
ast_node::match_statement(backing) {
var to_ret = set<*ast_node>()
backing.cases.for_each(fun(n: *ast_node) to_ret += find_closed_variables(func, n);)
return to_ret
}
ast_node::case_statement(backing) return find_closed_variables(func, backing.statement)
ast_node::while_loop(backing) return find_closed_variables(func, backing.condition) + find_closed_variables(func, backing.statement)
ast_node::for_loop(backing) {
return find_closed_variables(func, backing.init) + find_closed_variables(func, backing.condition) +
find_closed_variables(func, backing.update) + find_closed_variables(func, backing.body)
}
ast_node::defer_statement(backing) return find_closed_variables(func, backing.statement)
ast_node::assignment_statement(backing) return find_closed_variables(func, backing.to) + find_closed_variables(func, backing.from)
ast_node::declaration_statement(backing) return find_closed_variables(func, backing.expression) + find_closed_variables(func, backing.init_method_call)
ast_node::if_comp(backing) return find_closed_variables(func, backing.statement)
ast_node::cast(backing) return find_closed_variables(func, backing.value)
}
return set<*ast_node>()
}
fun in_scope_chain(node: *ast_node, high_scope: *ast_node): bool {
if (node == high_scope)
return true
if (get_ast_scope(node)->contains_key(str("~enclosing_scope")))
return in_scope_chain(get_ast_scope(node)->get(str("~enclosing_scope"))[0], high_scope)
return false
}
fun function_value_lower(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>) {
var curr_time = get_time()
var visited = hash_set<*ast_node>()
var lambdas = set<*ast_node>()
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
lambdas.add(syntax_ast_pair.second->translation_unit.lambdas)
// process in order so that inner lambdas are handled before outer ones, so closed-over
// variables can propagate outwards
syntax_ast_pair.second->translation_unit.lambdas.for_each(fun(n: *ast_node) {
n->function.closed_variables = find_closed_variables(n, n->function.body_statement)
})
})
var all_types = hash_set<*type>()
var function_value_creation_points = vec<function_parent_block>()
var function_value_call_points = vec<function_parent_block>()
var closed_over_uses = vec<pair<*ast_node, pair<*ast_node, *ast_node>>>()
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
var helper_before = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {
var t = get_ast_type(node)
if (t) all_types.add(t)
match(*node) {
// gotta get #sizeof<function>
ast_node::compiler_intrinsic(c) c.type_parameters.for_each( fun(item: *type) all_types.add(item); )
ast_node::identifier(backing) {
// see if this identifier use is a closed variable in a closure
var enclosing_func = parent_chain->item_from_top_satisfying_or(fun(n: *ast_node): bool return is_function(n);, null<ast_node>())
if (enclosing_func && enclosing_func->function.closed_variables.contains(node)) {
closed_over_uses.add(make_pair(node, make_pair(parent_chain->top(), enclosing_func)))
}
}
ast_node::function(backing) {
var parent = parent_chain->top()
// a function value is needed if this isn't a regular function
// definition (or a lambda's top-level reference) and its type isn't raw
var need_done = !is_translation_unit(parent) && !backing.type->is_raw
if (need_done) {
function_value_creation_points.add(make_function_parent_block(node, parent_chain->top(),
parent_chain->item_from_top_satisfying(fun(i: *ast_node): bool return is_code_block(i);),
parent_chain->item_from_top_satisfying(fun(i: *ast_node): bool return is_function(i);)
))
}
}
ast_node::function_call(backing) {
if (!get_ast_type(backing.func)->is_raw)
function_value_call_points.add(make_function_parent_block(backing.func, node, null<ast_node>(), null<ast_node>()))
}
}
}
run_on_tree(helper_before, empty_pass_second_half(), syntax_ast_pair.second, &visited)
})
curr_time = split(curr_time, "\tclosed_over_uses + function_value_call_points")
var void_ptr = type_ptr(base_type::void_return(), 1)
var lambda_type_to_struct_type_and_call_func = map<type, pair<*type, *ast_node>>(); //freaking vexing parse moved
all_types.chaotic_closure(fun(t: *type): set<*type> {
if (t->is_function())
return from_vector(t->parameter_types + t->return_type)
return set<*type>()
})
var all_type_values = all_types.map(fun(t: *type): type {
if (t->indirection != 0 || t->is_ref)
return *t->clone_with_indirection(0, false)
else
return *t
})
curr_time = split(curr_time, "\tall types/all type values")
all_type_values.for_each(fun(t: type) {
if (t.is_function() && t.indirection == 0 && !t.is_ref && !t.is_raw && !lambda_type_to_struct_type_and_call_func.contains_key(t)) {
var cleaned = t.clone()
cleaned->is_raw = true
var new_type_def_name = t.to_string() + "_function_value_struct"
var new_type_def = _type_def(new_type_def_name)
var func_ident = _ident("func", cleaned, new_type_def)
add_to_scope("func", func_ident, new_type_def)
var func_closure_type = cleaned->clone()
func_closure_type->parameter_types.add(0, type_ptr(base_type::void_return(), 1))
var func_closure_ident = _ident("func_closure", func_closure_type, new_type_def)
add_to_scope("func_closure", func_closure_ident, new_type_def)
var data_ident = _ident("data", void_ptr, new_type_def)
add_to_scope("data", data_ident, new_type_def)
new_type_def->type_def.variables.add(_declaration(func_ident, null<ast_node>()))
new_type_def->type_def.variables.add(_declaration(func_closure_ident, null<ast_node>()))
new_type_def->type_def.variables.add(_declaration(data_ident, null<ast_node>()))
add_to_scope("~enclosing_scope", name_ast_map->values.first().second, new_type_def)
add_to_scope(new_type_def_name, new_type_def, name_ast_map->values.first().second)
name_ast_map->values.first().second->translation_unit.children.add(new_type_def)
var lambda_struct_type = type_ptr(new_type_def)
var lambda_call_type = type_ptr(vec(lambda_struct_type) + t.parameter_types, t.return_type, 0, false, false, true)
// create parameters
var lambda_call_func_param = _ident("func_struct", lambda_struct_type, null<ast_node>())
var lambda_call_parameters = vec(lambda_call_func_param) + cleaned->parameter_types.map(fun(t:*type): *ast_node {
return _ident("pass_through_param", t, null<ast_node>())
})
var lambda_call_function = _function(str("lambda_call"), lambda_call_type, lambda_call_parameters, false)
// create call body with if, etc
var if_statement = _if(access_expression(lambda_call_func_param, "data"))
lambda_call_function->function.body_statement = _code_block(if_statement)
if_statement->if_statement.then_part = _code_block(_return(_func_call(access_expression(lambda_call_func_param, "func_closure"),
vec(access_expression(lambda_call_func_param, "data")) + lambda_call_parameters.slice(1,-1))))
if_statement->if_statement.else_part = _code_block(_return(_func_call(access_expression(lambda_call_func_param, "func"),
lambda_call_parameters.slice(1,-1))))
lambda_type_to_struct_type_and_call_func[t] = make_pair(lambda_struct_type, lambda_call_function)
// we have to add it for t and *cleaned since we might get either (we make the lambda's type raw later, so if used at creation point will be cleaned...)
// NOPE does this for other functions not lambdas super wrong
/*lambda_type_to_struct_type_and_call_func[*cleaned] = make_pair(lambda_struct_type, lambda_call_function)*/
name_ast_map->values.first().second->translation_unit.children.add(new_type_def)
name_ast_map->values.first().second->translation_unit.children.add(lambda_call_function)
}
})
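
To make the shape of what this loop builds concrete, here is a hedged C sketch of the struct and dispatch helper it roughly corresponds to once generated, specialized to a function value of type fun(int): int. All names here are illustrative; the real names come from the type's to_string() plus "_function_value_struct" and "lambda_call":

    typedef int (*plain_fn)(int);
    typedef int (*closure_fn)(void *env, int);

    typedef struct {
        plain_fn   func;          /* used when no state is captured          */
        closure_fn func_closure;  /* used when data points at a closure env  */
        void      *data;          /* closure environment, or null            */
    } int_int_function_value;

    /* mirrors the generated lambda_call helper: dispatch on data */
    static int int_int_lambda_call(int_int_function_value fv, int arg) {
        if (fv.data)
            return fv.func_closure(fv.data, arg);
        return fv.func(arg);
    }
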
curr_time = split(curr_time, "\tall type values forEach")
var lambda_creation_funcs = map<*ast_node, *ast_node>()
// create the closure type for each lambda
var closure_id = 0
lambdas.for_each(fun(l: *ast_node) {
var closure_struct_type: *type
if (l->function.closed_variables.size()) {
var new_type_def_name = str("closure_struct_") + closure_id++
var new_type_def = _type_def(new_type_def_name)
l->function.closed_variables.for_each(fun(v: *ast_node) {
// THIS MIGHT HAVE TO ACCOUNT FOR FUNC REFS
var closed_var_type = v->identifier.type
if (lambda_type_to_struct_type_and_call_func.contains_key(*closed_var_type))
closed_var_type = lambda_type_to_struct_type_and_call_func[*closed_var_type].first
var closed_ident = _ident(v->identifier.name, closed_var_type->clone_with_increased_indirection(), new_type_def)
new_type_def->type_def.variables.add(_declaration(closed_ident, null<ast_node>()))
add_to_scope(v->identifier.name, closed_ident, new_type_def)
})
add_to_scope("~enclosing_scope", name_ast_map->values.first().second, new_type_def)
add_to_scope(new_type_def_name, new_type_def, name_ast_map->values.first().second)
name_ast_map->values.first().second->translation_unit.children.add(new_type_def)
closure_struct_type = type_ptr(new_type_def)->clone_with_increased_indirection()
}
var return_type = lambda_type_to_struct_type_and_call_func[*l->function.type].first
var creation_type = type_ptr(vec<*type>(), return_type, 0, false, false, true)
lambda_creation_funcs[l] = _function(l->function.name + "_creation", creation_type, vec<*ast_node>(), false);
var body = _code_block()
var ident = _ident("to_ret", return_type, body)
body->code_block.children.add(_declaration(ident, null<ast_node>()))
body->code_block.children.add(_assign(access_expression(ident, "func"), l))
body->code_block.children.add(_assign(access_expression(ident, "func_closure"), l))
if (l->function.closed_variables.size()) {
var closure_lambda_param = _ident("closure_data_pass", closure_struct_type, l)
l->function.parameters.add(0, closure_lambda_param)
var closure_param = _ident("closure", closure_struct_type, body)
lambda_creation_funcs[l]->function.parameters.add(closure_param)
body->code_block.children.add(_assign(access_expression(ident, "data"), closure_param))
l->function.closed_variables.for_each(fun(v: *ast_node) {
// have to make sure to clean here as well
// THIS MIGHT HAVE TO ACCOUNT FOR FUNC REFS
var closed_param_type = v->identifier.type
if (lambda_type_to_struct_type_and_call_func.contains_key(*closed_param_type))
closed_param_type = lambda_type_to_struct_type_and_call_func[*closed_param_type].first
var closed_param = _ident("closed_param", closed_param_type->clone_with_increased_indirection(), l)
lambda_creation_funcs[l]->function.parameters.add(closed_param)
body->code_block.children.add(_assign(access_expression(closure_param, v->identifier.name), closed_param))
})
} else {
body->code_block.children.add(_assign(access_expression(ident, "data"), _value(str("0"), type_ptr(base_type::void_return(), 1))))
}
body->code_block.children.add(_return(ident))
lambda_creation_funcs[l]->function.body_statement = body
name_ast_map->values.first().second->translation_unit.children.add(lambda_creation_funcs[l])
})
curr_time = split(curr_time, "\tlambdas forEach")
function_value_call_points.for_each(fun(p: function_parent_block) {
// parent is the function call
var function_struct = p.function
var func_type = get_ast_type(p.function)
if (func_type->is_ref)
func_type = func_type->clone_without_ref()
p.parent->function_call.func = lambda_type_to_struct_type_and_call_func[*func_type].second
p.parent->function_call.parameters.add(0, function_struct)
})
curr_time = split(curr_time, "\tfunction_value_call_points.forEach")
function_value_creation_points.for_each(fun(p: function_parent_block) {
var lambda_creation_params = vec<*ast_node>()
// add the declaration of the closure struct to the enclosing code block
if (p.function->function.closed_variables.size()) {
// pull closure type off lambda creation func parameter
var closure_type = get_ast_type(lambda_creation_funcs[p.function]->function.parameters[0])->clone_with_decreased_indirection()
var closure_struct_ident = _ident("closure_struct", closure_type, p.parent_block)
p.parent_block->code_block.children.add(0,_declaration(closure_struct_ident, null<ast_node>()))
lambda_creation_params.add(make_operator_call("&", vec(closure_struct_ident)))
p.function->function.closed_variables.for_each(fun(v: *ast_node) {
var addr_of = make_operator_call("&", vec(v))
if (p.parent_function->function.closed_variables.contains(v)) {
closed_over_uses.add(make_pair(v, make_pair(addr_of, p.parent_function)))
}
lambda_creation_params.add(addr_of)
})
}
var func_call = _func_call(lambda_creation_funcs[p.function], lambda_creation_params)
replace_with_in(p.function, func_call, p.parent)
})
curr_time = split(curr_time, "\tfunction_value_creation_points.forEach")
lambdas.for_each(fun(l: *ast_node) l->function.type = l->function.type->clone();)
all_types.for_each(fun(t: *type) {
var t_nptr = t
if (t->indirection != 0 || t->is_ref) {
t_nptr = t->clone()
t_nptr->indirection = 0
t_nptr->is_ref = false
}
if (lambda_type_to_struct_type_and_call_func.contains_key(*t_nptr)) {
if (t_nptr != t)
*t = *lambda_type_to_struct_type_and_call_func[*t_nptr].first->clone_with_indirection(t->indirection, t->is_ref)
else
*t = *lambda_type_to_struct_type_and_call_func[*t_nptr].first
}
})
curr_time = split(curr_time, "\tlambdas.for_each")
closed_over_uses.for_each(fun(p: pair<*ast_node, pair<*ast_node, *ast_node>>) {
var variable = p.first
var parent = p.second.first
var lambda = p.second.second
var closure_param = lambda->function.parameters[0]
replace_with_in(variable, make_operator_call("*", vec(access_expression(closure_param, variable->identifier.name))), parent)
})
curr_time = split(curr_time, "\tclosed_over_uses")
// now we can make them raw
lambdas.for_each(fun(l: *ast_node) {
l->function.type->is_raw = true;
})
curr_time = split(curr_time, "\tlambdas is raw")
}

View File

@@ -1,86 +0,0 @@
import mem:*
__if_comp__ __C__ simple_passthrough(::"-pthread")
"""
#include <pthread.h>
"""
fun pthread_create(thrd : *ulong, strt_routine : fun(*void) : *void, input : *void) : int {
__if_comp__ __C__ {
simple_passthrough(thrd,strt_routine,input::) """
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
//int ret = pthread_create((pthread_t*)thrd, &attr, strt_routine.func, strt_routine.data);
int ret = pthread_create((pthread_t*)thrd, &attr, strt_routine.func, input);
pthread_attr_destroy(&attr);
return ret;
"""
}
return 0
}
fun pthread_join(thrd : *ulong) : int {
__if_comp__ __C__ { simple_passthrough(thrd::) """
pthread_t thread = *((pthread_t*)thrd);
return pthread_join(thread, NULL);
"""
}
return 0
}
fun pthread_exit() : *void {
__if_comp__ __C__ { simple_passthrough
"""
pthread_exit(NULL);
"""
}
}
fun future<T>(in : fun() : T ) : future<T> {
var out.construct(in) : future<T>
return out
}
obj func_res { var func : *void; var result : *void; }
obj future<T> {
var result : T
var status : int
var psy : fun() : T
var wrapper : fun(*void) : * void;
var thread : ulong
fun construct(in : fun() : T) : *future<T> {
status = 0
psy = in
wrapper = fun(in : *void) : *void {
var triple = (in) cast *func_res;
var func = (triple->func) cast *fun() : T;
var res : *T = (triple->result) cast *T;
(*res) = (*func)();
pthread_exit();
delete(in);
return null<void>();
}
return this
}
fun run() {
var in = new<func_res>();
in->result = (&result) cast *void;
in->func = (&psy) cast *void;
status = pthread_create(&thread,wrapper,(in) cast *void)
}
fun get_status():int {
return status
}
fun finish() : T {
pthread_join(&thread)
return result
}
}
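
Since the passthrough blocks above lean directly on pthreads, here is a hedged, self-contained C sketch of the same pattern future<T> uses: heap-allocate a function/result holder (the func_res counterpart), hand it to a joinable thread through a void* wrapper, then join and read the result. The names task_fn, func_res_c, wrapper and answer are invented for the example, and it frees the holder and returns from the wrapper instead of calling pthread_exit:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef int (*task_fn)(void);

    /* counterpart of func_res: the function to run and where to put its result */
    typedef struct { task_fn func; int *result; } func_res_c;

    /* counterpart of the wrapper lambda: unpack, call, store, clean up */
    static void *wrapper(void *arg) {
        func_res_c *fr = (func_res_c *)arg;
        *fr->result = fr->func();
        free(fr);
        return NULL;
    }

    static int answer(void) { return 42; }

    int main(void) {
        pthread_t thread;
        int result = 0;
        func_res_c *fr = malloc(sizeof *fr);
        fr->func = answer;
        fr->result = &result;

        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
        pthread_create(&thread, &attr, wrapper, fr);   /* future.run()    */
        pthread_attr_destroy(&attr);

        pthread_join(thread, NULL);                    /* future.finish() */
        printf("%d\n", result);
        return 0;
    }
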

View File

@@ -1,639 +0,0 @@
import str
import vec
import set
import stack
import map
import symbol
import regex
import io
import util
import serialize
fun split_into_words(gram_str: str::str): vec::vec<str::str> {
// var out.construct(): vec::vec<str>
var out.construct(): vec::vec<str::str>
var begin = 0
for (var i = 0; i < gram_str.length(); i++;) {
if (gram_str[i] == '#') {
while(gram_str[i] != '\n') i++
i++
io::print("comment: "); io::print(gram_str.slice(begin, i))
begin = i
}
if (gram_str[i] == '"') {
i++
while (gram_str[i] != '"') {
i++
// if we hit a " we check whether an odd number of backslashes precede it
// (meaning the " is escaped), and if so we move on. Otherwise we have
// found the end of the quoted str
if (gram_str[i] == '"') {
var escaped = 0
while (gram_str[i-(1+escaped)] == '\\') escaped++
if (escaped % 2)
i++
}
}
}
if (gram_str[i] == ' ') {
out.add(gram_str.slice(begin, i))
// allow multiple spaces between words
while (gram_str[i] == ' ') i++
begin = i
i--
}
if (gram_str[i] == '\n') {
if (i != begin)
out.add(gram_str.slice(begin, i))
begin = i + 1
}
}
return out
}
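
The quoted-string scan above turns on one check: a quote only ends the string if it is preceded by an even number of backslashes. A hedged C sketch of just that check (quote_is_escaped is an invented helper, not part of the compiler):

    #include <stdbool.h>
    #include <stdio.h>

    /* true if the '"' at position i is escaped, i.e. preceded by an odd
       number of consecutive backslashes, mirroring the escaped loop above */
    static bool quote_is_escaped(const char *s, int i) {
        int backslashes = 0;
        while (i - (1 + backslashes) >= 0 && s[i - (1 + backslashes)] == '\\')
            backslashes++;
        return backslashes % 2 != 0;
    }

    int main(void) {
        const char *word = "\"a\\\"b\"";           /* the grammar text "a\"b"       */
        printf("%d\n", quote_is_escaped(word, 3)); /* 1: the inner quote is escaped */
        printf("%d\n", quote_is_escaped(word, 5)); /* 0: the closing quote is real  */
        return 0;
    }
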
fun load_grammer(gram_str: str::str): grammer {
var gram.construct(): grammer
var leftSide = symbol::symbol("", false)
var doLeftSide = true
var rightSide = vec::vec<symbol::symbol>()
/*split_into_words(io::read_file(path)).for_each(fun(word: str::str) {*/
/*io::print("word: "); io::println(word);*/
/*})*/
/*return gram*/
split_into_words(gram_str).for_each(fun(word: str::str) {
/*io::print("word: "); io::println(word)*/
if (word == "=") {
// do nothing
} else if (word == "|") {
gram.rules.add(rule(leftSide, rightSide))
rightSide = vec::vec<symbol::symbol>()
} else if (word == ";") {
gram.rules.add(rule(leftSide, rightSide))
rightSide = vec::vec<symbol::symbol>()
doLeftSide = true
} else {
if (doLeftSide) {
leftSide = symbol::symbol(word, false)
gram.non_terminals.add(leftSide)
} else {
if (word[0] == '"') {
// we support both plain terminals like "hia*"
// and decorated terminals like "hia*":hi_with_as,
// so first find the closing " and check whether
// it is the end of the word
var last_quote = word.length()-1
while(word[last_quote] != '"') last_quote--
if (last_quote != word.length()-1) {
rightSide.add(symbol::symbol(word.slice(last_quote+2, -1), true))
gram.terminals.add(util::make_pair(symbol::symbol(word.slice(last_quote+2, -1), true), regex::regex(word.slice(1,last_quote))))
} else {
rightSide.add(symbol::symbol(word, true))
gram.terminals.add(util::make_pair(symbol::symbol(word, true), regex::regex(word.slice(1,last_quote))))
}
} else {
var non_term = symbol::symbol(word, false)
rightSide.add(non_term)
gram.non_terminals.add(non_term)
}
}
doLeftSide = false
}
})
return gram
}
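
For the terminal branch above, a word is either a plain quoted regex such as "hia*" or a decorated one such as "hia*":hi_with_as, where the text after the closing quote and colon names the symbol. A hedged C sketch of that split, scanning back for the last quote the same way (split_terminal is an invented helper and the buffers are assumed large enough):

    #include <stdio.h>
    #include <string.h>

    static void split_terminal(const char *word, char *regex_out, char *name_out) {
        size_t len = strlen(word);
        size_t last_quote = len - 1;
        while (word[last_quote] != '"')
            last_quote--;
        /* the regex body sits between the opening quote and the last quote */
        memcpy(regex_out, word + 1, last_quote - 1);
        regex_out[last_quote - 1] = '\0';
        if (last_quote != len - 1)
            strcpy(name_out, word + last_quote + 2);  /* decorated: skip the quote and colon */
        else
            strcpy(name_out, word);                   /* plain: the whole word is the name   */
    }

    int main(void) {
        char regex_text[64], name[64];
        split_terminal("\"[0-9]+\":number", regex_text, name);
        printf("%s -> %s\n", name, regex_text);   /* number -> [0-9]+ */
        split_terminal("\"hia*\"", regex_text, name);
        printf("%s -> %s\n", name, regex_text);   /* "hia*" -> hia*   */
        return 0;
    }
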
obj grammer (Object, Serializable) {
var rules: vec::vec<rule>
var non_terminals: set::set<symbol::symbol>
var terminals: vec::vec<util::pair<symbol::symbol, regex::regex>>
var first_set_map: map::map<symbol::symbol, set::set<symbol::symbol>>
var parse_table: table
fun construct(): *grammer {
rules.construct()
non_terminals.construct()
terminals.construct()
first_set_map.construct()
parse_table.construct()
}
fun copy_construct(old: *grammer) {
rules.copy_construct(&old->rules)
non_terminals.copy_construct(&old->non_terminals)
terminals.copy_construct(&old->terminals)
first_set_map.copy_construct(&old->first_set_map)
parse_table.copy_construct(&old->parse_table)
}
fun operator=(other: grammer) {
destruct()
copy_construct(&other)
}
fun destruct() {
rules.destruct()
non_terminals.destruct()
terminals.destruct()
first_set_map.destruct()
parse_table.destruct()
}
fun serialize(): vec::vec<char> {
return serialize::serialize(rules) + serialize::serialize(non_terminals) + serialize::serialize(terminals) + serialize::serialize(first_set_map) + serialize::serialize(parse_table)
}
fun unserialize(it: ref vec::vec<char>, pos: int): int {
// get everything constructed before the assignment
/*construct()*/
/*util::unpack(rules, pos) = serialize::unserialize<vec::vec<rule>>(it, pos)*/
/*util::unpack(non_terminals, pos) = serialize::unserialize<set::set<symbol::symbol>>(it, pos)*/
/*util::unpack(terminals, pos) = serialize::unserialize<vec::vec<util::pair<symbol::symbol, regex::regex>>>(it, pos)*/
/*util::unpack(first_set_map, pos) = serialize::unserialize<map::map<symbol::symbol, set::set<symbol::symbol>>>(it, pos)*/
/*util::unpack(parse_table, pos) = serialize::unserialize<table>(it, pos)*/
// do it in place. Actually looks nicer too
pos = rules.unserialize(it, pos)
pos = non_terminals.unserialize(it, pos)
pos = terminals.unserialize(it, pos)
pos = first_set_map.unserialize(it, pos)
pos = parse_table.unserialize(it, pos)
return pos
}
fun calculate_first_set() {
// the first set of a terminal is itself
terminals.for_each( fun(terminal: util::pair<symbol::symbol, regex::regex>)
first_set_map[terminal.first] = set::set(terminal.first)
)
// start out the non-terminals as empty sets
non_terminals.for_each( fun(non_terminal: symbol::symbol)
first_set_map[non_terminal] = set::set<symbol::symbol>()
)
var changed = true
while (changed) {
changed = false
rules.for_each( fun(r: rule) {
var rule_lookahead = first_vector(r.rhs)
if (!changed) {
changed = !first_set_map[r.lhs].contains(rule_lookahead)
}
first_set_map[r.lhs].add(rule_lookahead)
})
}
}
fun first_vector(rhs: ref vec::vec<symbol::symbol>): set::set<symbol::symbol> {
var toRet = set::set<symbol::symbol>()
if (rhs.size) {
for (var i = 0; i < rhs.size; i++;) {
var lookahead = first_set_map[rhs[i]]
if (lookahead.contains(symbol::null_symbol())) {
// remove the null if this is not the last in the rule
if (i != rhs.size-1)
lookahead.remove(symbol::null_symbol())
toRet.add(lookahead)
} else {
toRet.add(lookahead)
break
}
}
} else {
toRet.add(symbol::null_symbol())
}
return toRet
}
fun calculate_state_automaton() {
var first_state = closure(state(vec::vec(rules[0].with_lookahead(set::set(symbol::eof_symbol())))))
var states = vec::vec(first_state) // vec instead of set because we need to iterate by index
var newItems = stack::stack(0) // 0 is the index of the first and only item in states
var count = 0
while (newItems.size()) {
if (count%200 == 0) {
io::print("calculate_state_automaton while")
io::println(count)
}
count++
var I = newItems.pop()
var possGoto = set::set<symbol::symbol>()
states[I].items.for_each(fun(r: ref rule) {
if (!r.at_end())
possGoto.add(r.next())
// if r is at end or the rest reduces to null, add a reduce for each lookahead symbol
if ( r.at_end() || first_vector(r.after()).contains(symbol::null_symbol()) ) {
var rule_no = rules.find(r.plain())
r.lookahead.for_each(fun(sym: ref symbol::symbol) {
parse_table.add_reduce(I, sym, rule_no, r.position)
})
}
})
possGoto.for_each(fun(X: ref symbol::symbol) {
var goneState = goto(states[I], X)
if (goneState.items.size) {
var already_state = states.find(goneState)
if (already_state == -1) {
parse_table.add_push(I, X, states.size)
newItems.push(states.size)
states.add(goneState)
} else {
parse_table.add_push(I, X, already_state)
}
}
})
}
/*io::println("ALL STATES:\n")*/
/*states.for_each(fun(i: ref state) {*/
/*io::println("STATE:\n")*/
/*i.items.for_each(fun(r: ref rule) {*/
/*io::println(str::str("\t") + r.to_string())*/
/*})*/
/*})*/
io::println(" there were : states")
io::println(states.size)
/*io::println(" there were : table")*/
/*io::println(parse_table.to_string())*/
/*parse_table.print_string()*/
}
fun closure(initial: ref state): state {
initial.items = closure(initial.items)
return initial
}
fun closure(initial: ref vec::vec<rule>): vec::vec<rule> {
var continueIt = true
//var count = 0
while (continueIt) {
//io::print("closure while")
//io::println(count)
//count++
continueIt = false
for (var i = 0; i < initial.size; i++;) {
if (initial[i].at_end()) {
continue
}
rules.for_each(fun(r: ref rule) {
// if initial[i] is the item [A ::= c . B b, a], handle each rule B ::= ... in rules
if (r.lhs != initial[i].next())
return // continue the for-each
// add r with lookahead
var newLookahead = first_vector(initial[i].after_next())
if (newLookahead.contains(symbol::null_symbol())) {
newLookahead.remove(symbol::null_symbol())
newLookahead.add(initial[i].lookahead)
}
var alreadyInInSomeForm = false
for (var index = 0; index < initial.size; index++;) {
if (initial[index].equals_but_lookahead(r)) {
alreadyInInSomeForm = true
if (!initial[index].lookahead.contains(newLookahead)) {
//io::println("\n\n\n")
//io::println(initial[index].to_string())
//io::println("and")
//io::println(r.to_string())
//io::println("with")
//var result = str::str("|lookahead {")
//newLookahead.for_each(fun(i: symbol::symbol) {
//result += i.to_string()
//})
//io::println(result)
//io::println("are the same with different lookaheads")
initial[index].lookahead += newLookahead
//io::println("so now it's")
//io::println(initial[index].to_string())
//io::println("contineu because equal_but_different")
continueIt = true
return // continue the rules for-each
}
}
}
if (!alreadyInInSomeForm) {
continueIt = true
//io::println("\n\n\n")
//io::println("contineu because not contains")
//io::println(newRule.to_string())
initial.add(r.with_lookahead(newLookahead))
}
})
}
}
return initial
}
fun goto(I: ref state, X: ref symbol::symbol): state {
// loop through i, find all that have thing::= something . X more,
// add thing ::= something X . more
var jPrime = vec::vec<rule>()
I.items.for_each(fun(i: ref rule) {
if (!i.at_end() && i.next() == X)
jPrime.add(i.advanced())
})
// return closure(that)?
return state(closure(jPrime))
}
fun to_string(): str::str {
var result = str::str("grammer rules:")
rules.for_each( fun(i : rule) { result += str::str("\n\t") + i.to_string(); } )
result += "\nnon_terminals:"
non_terminals.for_each( fun(i : symbol::symbol) { result += str::str("\n\t") + i.to_string(); } )
result += "\nterminals:"
terminals.for_each( fun(i : util::pair<symbol::symbol, regex::regex>) { result += str::str("\n\t") + i.first.to_string() + ": " + i.second.regexString; } )
return result
}
}
fun rule(lhs: symbol::symbol, rhs: vec::vec<symbol::symbol>): rule {
var toRet.construct(): rule
toRet.lhs = lhs
toRet.rhs = rhs
return toRet
}
obj rule (Object, Serializable) {
var lhs: symbol::symbol
var rhs: vec::vec<symbol::symbol>
var position: int
var lookahead: set::set<symbol::symbol>
fun serialize(): vec::vec<char> {
return serialize::serialize(lhs) + serialize::serialize(rhs) + serialize::serialize(position) + serialize::serialize(lookahead)
}
fun unserialize(it: ref vec::vec<char>, pos: int): int {
/*var tempLhs = symbol::invalid_symbol()*/
/*var tempRhs = vec::vec<symbol::symbol>()*/
/*var tempLookahead = set::set<symbol::symbol>()*/
/*util::unpack(tempLhs, pos) = serialize::unserialize<symbol::symbol>(it, pos)*/
/*util::unpack(tempRhs, pos) = serialize::unserialize<vec::vec<symbol::symbol>>(it, pos)*/
/*util::unpack(position, pos) = serialize::unserialize<int>(it, pos)*/
/*util::unpack(tempLookahead, pos) = serialize::unserialize<set::set<symbol::symbol>>(it, pos)*/
/*lhs.copy_construct(&tempLhs)*/
/*rhs.copy_construct(&tempRhs)*/
/*lookahead.copy_construct(&tempLookahead)*/
pos = lhs.unserialize(it, pos)
pos = rhs.unserialize(it, pos)
util::unpack(position, pos) = serialize::unserialize<int>(it, pos)
return lookahead.unserialize(it, pos)
}
fun construct(): *rule {
lhs.construct()
rhs.construct()
position = 0
lookahead.construct()
}
fun copy_construct(other: *rule) {
lhs.copy_construct(&other->lhs)
rhs.copy_construct(&other->rhs)
position = other->position
lookahead.copy_construct(&other->lookahead)
}
fun operator=(other: rule) {
destruct()
copy_construct(&other)
}
fun operator==(other: ref rule):bool {
return lhs == other.lhs && rhs == other.rhs &&
position == other.position && lookahead == other.lookahead
}
fun equals_but_lookahead(other: ref rule):bool {
return lhs == other.lhs && rhs == other.rhs &&
position == other.position
}
fun destruct() {
lhs.destruct()
rhs.destruct()
lookahead.destruct()
}
fun next(): ref symbol::symbol {
return rhs[position]
}
fun after(): vec::vec<symbol::symbol> {
return rhs.slice(position, -1)
}
fun after_next(): vec::vec<symbol::symbol> {
return rhs.slice(position + 1, -1)
}
fun at_end(): bool {
return position >= rhs.size
}
fun plain(): rule {
return rule(lhs, rhs)
}
fun with_lookahead(newLookahead: set::set<symbol::symbol>): rule {
var toRet = rule(lhs, rhs)
toRet.position = position
toRet.lookahead = newLookahead
return toRet
}
fun advanced(): rule {
var toRet = rule(lhs, rhs)
toRet.position = position+1
toRet.lookahead = lookahead
return toRet
}
fun to_string(): str::str {
var result = lhs.name + " -> "
for (var i = 0; i < rhs.size; i++;)
if (i == position)
result += str::str(" . ") + rhs[i].to_string() + ", ";
else
result += rhs[i].to_string() + ", ";
if (position == rhs.size)
result += " . "
result += "|lookahead {"
lookahead.for_each(fun(i: symbol::symbol) {
result += i.to_string()
})
result += "}"
return result
}
}
fun state(itemsIn: ref vec::vec<rule>): state {
var toRet.construct(itemsIn): state
return toRet
}
obj state (Object) {
var items: vec::vec<rule>
fun construct(): *state {
items.construct()
}
fun construct(itemsIn: ref vec::vec<rule>): *state {
items.copy_construct(&itemsIn)
}
fun copy_construct(other: *state) {
items.copy_construct(&other->items)
}
fun operator=(other: state) {
destruct()
copy_construct(&other)
}
fun destruct() {
items.destruct()
}
fun operator==(other: ref state):bool {
return items == other.items
}
fun to_string(): str::str {
return str::str("woo a state")
}
}
adt action_type {
push,
reduce,
// note that these two are not actually currently used
// accept is the reduce of the goal rule and reject is the
// absence of actions
accept,
reject,
invalid
}
fun action(act: action_type, state_or_rule: int): action {
var toRet: action
toRet.act = act
toRet.state_or_rule = state_or_rule
toRet.rule_position = -1
return toRet
}
fun action(act: action_type, state_or_rule: int, rule_position: int): action {
var toRet: action
toRet.act = act
toRet.state_or_rule = state_or_rule
toRet.rule_position = rule_position
return toRet
}
obj action {
var act: action_type
var state_or_rule: int // sigh
var rule_position: int // sigh
fun operator==(other: action): bool {
return act == other.act && state_or_rule == other.state_or_rule && rule_position == other.rule_position
}
fun print() {
match (act) {
action_type::push()
io::print("push ")
action_type::reduce()
io::print("reduce ")
action_type::accept()
io::print("accept ")
action_type::reject()
io::print("reject ")
}
/*if (act == action_type::push)*/
/*io::print("push ")*/
/*else if (act == action_type::reduce)*/
/*io::print("reduce ")*/
/*else if (act == action_type::accept)*/
/*io::print("accept ")*/
/*else if (act == action_type::reject)*/
/*io::print("reject ")*/
io::print(state_or_rule)
io::print(" ")
io::print(rule_position)
io::println()
}
}
obj table (Object, Serializable) {
// a 2-dimensional table: a vec indexed by state number, holding maps from symbol to a vec of parse actions
var items: vec::vec<map::map<symbol::symbol, vec::vec<action>>>
fun construct(): *table {
items.construct()
}
fun copy_construct(other: *table) {
items.copy_construct(&other->items)
}
fun operator=(other: table) {
destruct()
copy_construct(&other)
}
fun destruct() {
items.destruct()
}
fun serialize(): vec::vec<char> {
return serialize::serialize(items)
}
fun unserialize(it: ref vec::vec<char>, pos: int): int {
/*construct()*/
/*util::unpack(items, pos) = serialize::unserialize<vec::vec<map::map<symbol::symbol, vec::vec<action>>>>(it, pos)*/
pos = items.unserialize(it, pos)
return pos
}
fun expand_to(include_state: int) {
while (include_state >= items.size)
items.addEnd(map::map<symbol::symbol, vec::vec<action>>())
}
// we always "clean" the symbol before using it so that having different data doesn't
// prevent us from finding the symbol in the table
fun clean_symbol(sym: ref symbol::symbol): symbol::symbol {
return symbol::symbol(sym.name, sym.terminal)
}
fun add_push(from_state: int, on_symbol: ref symbol::symbol, to_state: int) {
expand_to(from_state)
var cleaned_symbol = clean_symbol(on_symbol)
if (items[from_state].contains_key(cleaned_symbol))
items[from_state][cleaned_symbol].addEnd(action(action_type::push(), to_state))
else
items[from_state].set(cleaned_symbol, vec::vec(action(action_type::push(), to_state)))
}
fun add_reduce(from_state: int, on_symbol: ref symbol::symbol, by_rule_no: int, rule_position: int) {
expand_to(from_state)
var cleaned_symbol = clean_symbol(on_symbol)
if (items[from_state].contains_key(cleaned_symbol))
items[from_state][cleaned_symbol].addEnd(action(action_type::reduce(), by_rule_no, rule_position))
else
items[from_state].set(cleaned_symbol, vec::vec(action(action_type::reduce(), by_rule_no, rule_position)))
}
fun add_accept(from_state: int, on_symbol: ref symbol::symbol) {
expand_to(from_state)
var cleaned_symbol = clean_symbol(on_symbol)
if (items[from_state].contains_key(cleaned_symbol))
items[from_state][cleaned_symbol].addEnd(action(action_type::accept(), 0))
else
items[from_state].set(cleaned_symbol, vec::vec(action(action_type::accept(), 0)))
}
fun get(state: int, on_symbol: ref symbol::symbol): vec::vec<action> {
var cleaned_symbol = clean_symbol(on_symbol)
if (items[state].contains_key(cleaned_symbol))
return items[state][cleaned_symbol]
return vec::vec<action>()
}
fun get_shift(state: int, on_symbol: ref symbol::symbol): action {
var actions = get(state, on_symbol)
for (var i = 0; i < actions.size; i++;)
if (actions[i].act == action_type::push())
return actions[i]
io::println("tried to get a shift when none existed")
io::print("for state ")
io::print(state)
io::print(" and symbol ")
io::println(on_symbol.to_string())
return action(action_type::invalid(),-1)
}
fun get_reduces(state: int, on_symbol: ref symbol::symbol): vec::vec<action> {
return get(state, on_symbol).filter(fun(act: action):bool { return act.act == action_type::reduce(); })
}
fun print_string() {
/*return str::str("woo a table of size: ") + items.size*/
io::print("woo a table of size: ")
io::println(items.size)
for (var i = 0; i < items.size; i++;) {
io::print("for state: ")
io::println(i)
items[i].for_each(fun(sym: symbol::symbol, actions: vec::vec<action>) {
actions.for_each(fun(action: action) {
io::print("\ton symbol: ")
io::print(sym.to_string())
io::print(" do action: ")
action.print()
})
})
}
}
}

View File

@@ -1,120 +0,0 @@
import vec
import map
import io
import serialize
import util
fun hash_map<T,U>(): hash_map<T,U> {
var toRet.construct(): hash_map<T,U>
return toRet
}
fun hash_map<T,U>(key: ref T, value: ref U): hash_map<T,U> {
var toRet.construct(): hash_map<T,U>
toRet.set(key, value)
return toRet
}
obj hash_map<T,U> (Object, Serializable) {
var data: vec::vec<map::map<T,U>>
var size: int
fun construct(): *hash_map<T,U> {
data.construct()
data.add(map::map<T,U>())
size = 0
return this
}
fun copy_construct(old: *hash_map<T,U>) {
data.copy_construct(&old->data)
size = old->size
}
fun operator=(rhs: ref hash_map<T,U>) {
data = rhs.data
size = rhs.size
}
fun destruct() {
data.destruct()
}
fun serialize(): vec::vec<char> {
return serialize::serialize(data) + serialize::serialize(size)
}
fun unserialize(it: ref vec::vec<char>, pos: int): int {
pos = data.unserialize(it, pos)
util::unpack(size, pos) = serialize::unserialize<int>(it, pos)
return pos
}
// the old trick of adding an unnecessary template parameter so this is
// only generated if actually used (in this case, swapping U out for V)
fun operator==<V>(other: ref hash_map<T,V>): bool {
return data == other.data
}
fun set(key: ref T, value: ref U) {
var key_hash = util::hash(key)
if (!data[(key_hash%data.size) cast int].contains_key(key)) {
size++
if (size > data.size) {
var new_data.construct(size*2): vec::vec<map::map<T,U>>
for (var i = 0; i < size*2; i++;)
new_data.addEnd(map::map<T,U>())
for_each(fun(key: T, value: U) {
new_data[(util::hash(key)%new_data.size) cast int].set(key, value)
})
data.swap(new_data)
}
}
data[(key_hash%data.size) cast int].set(key, value)
}
fun get(key: ref T): ref U {
return data[(util::hash(key)%data.size) cast int].get(key)
}
fun get_ptr_or_null(key: ref T): *U {
return data[(util::hash(key)%data.size) cast int].get_ptr_or_null(key)
}
fun contains_key(key: ref T): bool {
return data[(util::hash(key)%data.size) cast int].contains_key(key)
}
fun contains_value(value: ref U): bool {
for (var i = 0; i < data.size; i++;) {
if (data[i].contains_value(value))
return true
}
return false
}
fun reverse_get(value: ref U): ref T {
for (var i = 0; i < data.size; i++;) {
if (data[i].contains_value(value))
return data[i].reverse_get(value)
}
io::println("trying to reverse get a value that is not in the hash_map")
}
fun remove(key: ref T) {
data[(util::hash(key)%data.size) cast int].remove(key)
}
fun for_each(func: fun(T, U):void) {
for (var i = 0; i < data.size; i++;)
data[i].for_each(func)
}
fun operator[](key: ref T): ref U {
return get(key)
}
fun operator[]=(key: ref T, value: ref U) {
set(key,value)
}
fun get_with_default(key: ref T, default_val: ref U): ref U {
if (contains_key(key))
return get(key)
return default_val
}
fun clear() {
data.clear()
size = 0
data.add(map::map<T,U>())
}
fun pop(): util::pair<T,U> {
for (var i = 0; i < data.size; i++;)
if (data[i].size() > 0)
return data[i].pop()
io::println("trying to pop out of an empty hash_map")
}
}
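
A hedged C sketch of the same growth strategy, stripped down to string keys and int values: an array of buckets indexed by hash(key) % bucket_count, doubled and fully rehashed once the entry count passes the bucket count (the same trigger as size > data.size above). Chained nodes stand in for the per-bucket map::map, and every name here is invented:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct node { char *key; int value; struct node *next; } node;
    typedef struct { node **buckets; size_t bucket_count, size; } hash_map_c;

    static size_t hash_str(const char *s) {            /* any reasonable hash will do */
        size_t h = 5381;
        while (*s) h = h * 33 + (unsigned char)*s++;
        return h;
    }

    static void put(hash_map_c *m, const char *key, int value);

    static void grow(hash_map_c *m) {                  /* double the buckets, re-insert everything */
        node **old = m->buckets;
        size_t old_count = m->bucket_count;
        m->bucket_count *= 2;
        m->buckets = calloc(m->bucket_count, sizeof *m->buckets);
        m->size = 0;
        for (size_t i = 0; i < old_count; i++)
            for (node *n = old[i]; n; ) {
                node *next = n->next;
                put(m, n->key, n->value);
                free(n->key); free(n);
                n = next;
            }
        free(old);
    }

    static void put(hash_map_c *m, const char *key, int value) {
        size_t b = hash_str(key) % m->bucket_count;
        for (node *n = m->buckets[b]; n; n = n->next)
            if (strcmp(n->key, key) == 0) { n->value = value; return; }
        if (m->size + 1 > m->bucket_count) {           /* same trigger as size > data.size */
            grow(m);
            b = hash_str(key) % m->bucket_count;
        }
        node *n = malloc(sizeof *n);
        n->key = strdup(key); n->value = value; n->next = m->buckets[b];
        m->buckets[b] = n;
        m->size++;
    }

    int main(void) {
        hash_map_c m = { calloc(1, sizeof(node *)), 1, 0 };
        put(&m, "a", 1); put(&m, "b", 2); put(&m, "a", 3);
        printf("%zu buckets, %zu entries\n", m.bucket_count, m.size);
        return 0;
    }
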

View File

@@ -1,147 +0,0 @@
import hash_map
import vec
import io
import serialize
import set
fun hash_set<T>(): hash_set<T> {
var toRet.construct() : hash_set<T>
return toRet
}
fun hash_set<T>(item: T): hash_set<T> {
var toRet.construct() : hash_set<T>
toRet.add(item)
return toRet
}
fun from_vector<T>(items: vec::vec<T>): hash_set<T> {
var toRet.construct() : hash_set<T>
items.for_each( fun(item: T) toRet.add(item); )
return toRet
}
obj hash_set<T> (Object, Serializable) {
var data: hash_map::hash_map<T,bool>
fun construct(): *hash_set<T> {
data.construct()
return this
}
/*fun construct(ammt: int): *hash_set<T> {*/
/*data.construct(ammt)*/
/*return this*/
/*}*/
fun copy_construct(old: *hash_set<T>) {
data.copy_construct(&old->data)
}
fun operator=(rhs: ref hash_set<T>) {
data = rhs.data
}
fun serialize(): vec::vec<char> {
return serialize::serialize(data)
}
fun unserialize(it: ref vec::vec<char>, pos: int): int {
return data.unserialize(it, pos)
}
// the old trick of adding an unnecessary template parameter so this is
// only generated if actually used (in this case, taking hash_set<V> rather than hash_set<T>)
fun operator==<V>(other: ref hash_set<V>): bool {
return data == other.data
}
fun operator!=(rhs: ref hash_set<T>): bool {
return ! (*this == rhs)
}
fun destruct() {
data.destruct()
}
fun size():int {
return data.size
}
/*fun contains(items: ref hash_set<T>): bool {*/
/*return items.size() == 0 || !items.any_true( fun(item: T): bool return !contains(item); )*/
/*}*/
fun contains(item: ref T): bool {
return data.contains_key(item)
}
fun operator+=(item: ref T) {
add(item)
}
/*fun operator+=(items: ref hash_set<T>) {*/
/*add(items)*/
/*}*/
/*fun operator+(items: ref hash_set<T>): hash_set<T> {*/
/*var to_ret.copy_construct(this): hash_set<T>*/
/*to_ret.add(items)*/
/*return to_ret*/
/*}*/
fun add(item: ref T) {
if (!contains(item))
data.set(item,true)
}
/*fun add_all(items: ref hash_set<T>) {*/
/*add(items)*/
/*}*/
/*fun add(items: ref hash_set<T>) {*/
/*items.for_each( fun(item: ref T) add(item); )*/
/*}*/
fun remove(item: ref T) {
data.remove(item)
}
/*fun for_each(func: fun(ref T):void) {*/
/*data.for_each(func)*/
/*}*/
fun for_each(func: fun(T):void) {
data.for_each(fun(key: T, value: bool) { func(key); })
}
fun map<U>(func: fun(T):U): set::set<U> {
var newSet.construct(size()): set::set<U>
for_each(fun(i: T) { newSet.add(func(i)); })
return newSet
}
/*fun any_true(func: fun(T):bool):bool {*/
/*return data.any_true(func)*/
/*}*/
/*fun reduce<U>(func: fun(T,U): U, initial: U): U {*/
/*return data.reduce(func, initial)*/
/*}*/
/*fun flatten_map<U>(func: fun(T):hash_set<U>):hash_set<U> {*/
/*var newSet.construct(size()): hash_set<U>*/
/*for (var i = 0; i < size(); i++;)*/
/*func(data[i]).for_each(fun(item: ref U) newSet.add(item);)*/
/*return newSet*/
/*}*/
/*fun filter(func: fun(T):bool):hash_set<T> {*/
/*var newSet.construct(): hash_set<T>*/
/*newSet.data = data.filter(func)*/
/*return newSet*/
/*}*/
fun chaotic_closure(func: fun(T): set::set<T>) {
var prev_size = 0
while (prev_size != size()) {
prev_size = size()
var to_add.construct(size()): vec::vec<T>
for_each(fun(i: T) {
func(i).for_each(fun(j: T) { to_add.add(j); })
})
to_add.for_each(fun(i:T) { add(i); })
}
}
fun pop(): T {
return data.pop().first
}
fun union(other: hash_set<T>): hash_set<T> {
for_each(fun(i: T) {
other.add(i)
})
return other
}
fun operator-(items: ref hash_set<T>): hash_set<T> {
var to_ret.copy_construct(this): hash_set<T>
items.for_each(fun(i: T) {
to_ret.remove(i)
})
return to_ret
}
}

View File

@@ -1,134 +0,0 @@
import symbol:*
import tree:*
import vec:*
import stack:*
import map:*
import util:*
import str:*
import mem:*
import io:*
import ast_nodes:*
import ast_transformation:*
import parser:*
fun import(file_name: str, parsers: ref vec<parser>, ast_pass: ref ast_transformation, import_paths: vec<str>): map<str, pair<*tree<symbol>,*ast_node>> {
var name_ast_map = map<str, pair<*tree<symbol>,*ast_node>>()
// lambda closes over our fix-up list
var imports_to_fix = vec<*ast_node>()
var import_first_pass = fun(file_name_idx: pair<str,int>) {
var file_name = file_name_idx.first
var file = str()
import_paths.for_each(fun(path: str) {
if (file_exists(path + file_name)) {
file = read_file(path + file_name)
} else {
}
})
printerr(file_name + ", ")
var parse_tree = parsers[file_name_idx.second].parse_input(file, file_name)
trim(parse_tree)
var ast_and_imports = ast_pass.first_pass(file_name, parse_tree)
imports_to_fix += ast_and_imports.second
name_ast_map[file_name] = make_pair(parse_tree, ast_and_imports.first)
}
printlnerr("**First Pass**")
printerr("parsing: ")
import_first_pass(make_pair(file_name,0))
for (var i = 0; i < imports_to_fix.size; i++;) {
var import_name = imports_to_fix[i]->import.name
var file_name = import_name + ".krak"
if (!name_ast_map.contains_key(file_name)) {
import_first_pass(make_pair(file_name,0))
}
var im = imports_to_fix[i]
var file_name = import_name + ".krak"
im->import.translation_unit = name_ast_map[file_name].second
add_to_scope(import_name, im->import.translation_unit, im->import.containing_translation_unit)
}
printlnerr()
printlnerr("**Second Pass**")
name_ast_map.for_each(fun(name: str, tree_pair: pair<*tree<symbol>, *ast_node>) ast_pass.second_pass(tree_pair.first, tree_pair.second);)
printlnerr("**Third Pass**")
name_ast_map.for_each(fun(name: str, tree_pair: pair<*tree<symbol>, *ast_node>) ast_pass.third_pass(tree_pair.first, tree_pair.second);)
printlnerr("**Fourth Pass**")
name_ast_map.for_each(fun(name: str, tree_pair: pair<*tree<symbol>, *ast_node>) ast_pass.fourth_pass(tree_pair.first, tree_pair.second);)
return name_ast_map
}
fun trim(parse_tree: *tree<symbol>) {
remove_node(symbol("$NULL$", false), parse_tree)
remove_node(symbol("WS", false), parse_tree)
// the terminals have " around them, which we have to escape
remove_node(symbol("\"\\(\"", true), parse_tree)
remove_node(symbol("\"\\)\"", true), parse_tree)
remove_node(symbol("\"template\"", true), parse_tree)
remove_node(symbol("\"return\"", true), parse_tree)
remove_node(symbol("\"defer\"", true), parse_tree)
remove_node(symbol("\";\"", true), parse_tree)
remove_node(symbol("line_end", false), parse_tree)
remove_node(symbol("\"{\"", true), parse_tree)
remove_node(symbol("\"}\"", true), parse_tree)
remove_node(symbol("\"(\"", true), parse_tree)
remove_node(symbol("\")\"", true), parse_tree)
remove_node(symbol("\"if\"", true), parse_tree)
remove_node(symbol("\"while\"", true), parse_tree)
remove_node(symbol("\"__if_comp__\"", true), parse_tree)
remove_node(symbol("\"comp_simple_passthrough\"", true), parse_tree)
/*remove_node(symbol("obj_nonterm", false), parse_tree)*/
remove_node(symbol("adt_nonterm", false), parse_tree)
collapse_node(symbol("case_statement_list", false), parse_tree)
collapse_node(symbol("opt_param_assign_list", false), parse_tree)
collapse_node(symbol("param_assign_list", false), parse_tree)
collapse_node(symbol("opt_typed_parameter_list", false), parse_tree)
collapse_node(symbol("opt_parameter_list", false), parse_tree)
collapse_node(symbol("intrinsic_parameter_list", false), parse_tree)
collapse_node(symbol("identifier_list", false), parse_tree)
collapse_node(symbol("adt_option_list", false), parse_tree)
collapse_node(symbol("statement_list", false), parse_tree)
collapse_node(symbol("parameter_list", false), parse_tree)
collapse_node(symbol("typed_parameter_list", false), parse_tree)
collapse_node(symbol("unorderd_list_part", false), parse_tree)
collapse_node(symbol("if_comp_pred", false), parse_tree)
collapse_node(symbol("declaration_block", false), parse_tree)
collapse_node(symbol("type_list", false), parse_tree)
collapse_node(symbol("opt_type_list", false), parse_tree)
collapse_node(symbol("template_param_list", false), parse_tree)
collapse_node(symbol("trait_list", false), parse_tree)
collapse_node(symbol("dec_type", false), parse_tree)
}
fun remove_node(remove: symbol, parse_tree: *tree<symbol>) {
var to_process = stack<*tree<symbol>>()
to_process.push(parse_tree)
while(!to_process.empty()) {
var node = to_process.pop()
for (var i = 0; i < node->children.size; i++;) {
if (!node->children[i] || node->children[i]->data.equal_wo_data(remove)) {
node->children.remove(i)
i--;
} else {
to_process.push(node->children[i])
}
}
}
}
fun collapse_node(remove: symbol, parse_tree: *tree<symbol>) {
var to_process = stack<*tree<symbol>>()
to_process.push(parse_tree)
while(!to_process.empty()) {
var node = to_process.pop()
for (var i = 0; i < node->children.size; i++;) {
if (node->children[i]->data.equal_wo_data(remove)) {
var add_children = node->children[i]->children;
// splice the child's own children into the current children
// in place of index i, without keeping i itself
node->children = node->children.slice(0,i) +
add_children + node->children.slice(i+1,-1)
i--;
} else {
to_process.push(node->children[i])
}
}
}
}

File diff suppressed because it is too large

View File

@@ -1,160 +0,0 @@
import str;
import vec;
import mem:*
ext fun printf(fmt_str: *char, ...): int
ext fun fprintf(file: *void, format: *char, ...): int
ext fun fflush(file: int): int
ext var stderr: *void
ext fun fgets(buff: *char, size: int, file: *void): *char
ext var stdin: *void
// dead simple stdin
fun get_line(prompt: str::str, line_size: int): str::str {
print(prompt)
return get_line(line_size)
}
fun get_line(line_size: int): str::str
return get_line(line_size, stdin)
fun get_line(line_size: int, file: *void): str::str {
var buff = new<char>(line_size)
if fgets(buff, line_size, file) == null<char>() {
delete(buff)
return str::str("***EOF***")
}
var to_ret = str::str(buff)
delete(buff)
return to_ret.slice(0,-2) // remove '\n'
}
fun printlnerr<T>(toPrint: T) : void {
printerr(toPrint)
printerr("\n")
}
fun printlnerr()
printerr("\n")
fun printerr(toPrint: str::str) : void {
var charArr = toPrint.toCharArray()
printerr(charArr)
delete(charArr)
}
fun printerr(toPrint: *char) : void {
fprintf(stderr, "%s", toPrint)
// stderr is already flushed
}
fun println<T>(toPrint: T) : void {
print(toPrint)
print("\n")
}
fun print(toPrint: *char) : void {
printf("%s", toPrint)
fflush(0)
}
fun println()
print("\n")
fun print(toPrint: char) : void
print(str::str(toPrint))
fun print(toPrint: str::str) : void {
var charArr = toPrint.toCharArray()
print(charArr)
delete(charArr)
}
fun print(toPrint: bool) {
if (toPrint)
print("true")
else
print("false")
}
fun print<T>(toPrint: T): void
print(str::to_string(toPrint))
// Ok, just some DEAD simple file io for now
ext fun fopen(path: *char, mode: *char): *void
ext fun fclose(file: *void): int
// fprintf is already used for stderr above
ext fun ftell(file: *void): long
ext fun fseek(file: *void, offset: long, whence: int): int
ext fun fread(ptr: *void, size: ulong, nmemb: ulong, file: *void): ulong
ext fun fwrite(ptr: *void, size: ulong, nmemb: ulong, file: *void): ulong
fun file_exists(path: str::str): bool {
var char_path = path.toCharArray()
defer delete(char_path)
var fp = fopen(char_path, "r")
if (fp) {
fclose(fp)
return true
}
return false
}
fun read_file(path: str::str): str::str {
if (!file_exists(path))
return str::str()
var toRet.construct(read_file_binary(path)): str::str
return toRet
}
fun write_file(path: str::str, data: str::str) {
var char_path = path.toCharArray()
defer delete(char_path)
var char_data = data.toCharArray()
defer delete(char_data)
var fp = fopen(char_path, "w")
fprintf(fp, "%s", char_data)
fclose(fp)
}
fun read_file_binary(path: str::str): vec::vec<char> {
var char_path = path.toCharArray()
defer delete(char_path)
var fp = fopen(char_path, "r")
fseek(fp, (0) cast long, 2)// fseek(fp, 0L, SEEK_END)
var size = ftell(fp)
fseek(fp, (0) cast long, 0)//fseek(fp, 0L, SEEK_SET)
var data = new<char>((size+1) cast int)
var readSize = fread((data) cast *void, (1) cast ulong, (size) cast ulong, fp)
fclose(fp)
data[readSize] = 0
var toRet.construct((size) cast int): vec::vec<char>
for (var i = 0; i < size; i++;)
toRet.add(data[i])
delete(data)
return toRet
}
fun write_file_binary(path: str::str, vdata: vec::vec<char>) {
var char_path = path.toCharArray()
defer delete(char_path)
var data = vdata.getBackingMemory()
var size = vdata.size
var fp = fopen(char_path, "wb")
fwrite((data) cast *void, (1) cast ulong, (size) cast ulong, fp)
fclose(fp)
}
fun BoldRed(): void{
print("\033[1m\033[31m");
}
fun BoldGreen(): void{
print("\033[1m\033[32m");
}
fun BoldYellow(): void{
print("\033[1m\033[33m");
}
fun BoldBlue(): void{
print("\033[1m\033[34m");
}
fun BoldMagenta(): void{
print("\033[1m\033[35m");
}
fun BoldCyan(): void{
print("\033[1m\033[36m");
}
fun Reset(): void{
print("\033[0m");
}
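// A hedged usage sketch (not part of the original file): the helpers above just emit ANSI
// escape sequences, so pair them with Reset() to restore the terminal state.
fun color_example(): void {
    BoldRed()
    print("error: ")
    Reset()
    println("something went wrong")
}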

View File

@@ -1,89 +0,0 @@
import regex
import symbol
import str
import vec
import util
fun lexer(regs: vec::vec<regex::regex>): lexer {
var toRet.construct() :lexer
regs.for_each( fun(reg: regex::regex) {
toRet.add_regex(util::make_pair(reg.regexString, reg));
})
return toRet
}
fun lexer(regs: vec::vec<util::pair<symbol::symbol, regex::regex>>): lexer {
var toRet.construct() :lexer
regs.for_each( fun(reg: util::pair<symbol::symbol, regex::regex>)
toRet.add_regex(util::make_pair(reg.first.name, reg.second));
)
return toRet
}
obj lexer (Object) {
var regs: vec::vec<util::pair<str::str, regex::regex>>
var input: str::str
var position: int
var line_number: int
fun construct(): *lexer {
regs.construct()
input.construct()
position = 0
line_number = 1
return this
}
fun destruct() {
regs.destruct()
input.destruct()
}
fun copy_construct(old: *lexer) {
regs.copy_construct(&old->regs)
input.copy_construct(&old->input)
position = old->position
line_number = old->line_number
}
fun operator=(old: lexer) {
destruct()
copy_construct(&old)
}
fun add_regex(name: str::str, newOne: regex::regex) {
regs.add(util::make_pair(name,newOne))
}
fun add_regex(newOne: util::pair<str::str,regex::regex>) {
regs.add(newOne)
}
fun add_regex(newOne: regex::regex) {
regs.add(util::make_pair(newOne.regexString, newOne))
}
fun add_regex(newOne: *char) {
regs.add(util::make_pair(str::str(newOne), regex::regex(newOne)))
}
fun set_input(in: ref str::str) {
position = 0
line_number = 1
input = in
}
fun next(): symbol::symbol {
if (position >= input.length())
return symbol::eof_symbol()
var max = -1
var max_length = -1
for (var i = 0; i < regs.size; i++;) {
var new_length = regs[i].second.long_match(input.getBackingMemory(), position, input.length())
if (new_length > max_length) {
max = i
max_length = new_length
}
}
if (max < 0)
return symbol::invalid_symbol()
for (var i = position; i < position+max_length; i++;)
if (input[i] == '\n')
line_number++
position += max_length
return symbol::symbol(regs[max].first, true, input.slice(position-max_length, position), line_number)
}
}
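// A hedged usage sketch (not part of the original file): it assumes regex::regex can be
// constructed straight from a *char pattern, as the add_regex(*char) overload above implies.
fun lexer_example(): void {
    var lex.construct(): lexer
    lex.add_regex("[0-9]+")
    lex.add_regex("[ \t]+")
    var input = str::str("12 34")
    lex.set_input(input)
    var tok = lex.next()   // longest match wins, so this yields the "12" token on line 1
}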

View File

@@ -1,130 +0,0 @@
import vec
import mem
import io
import serialize
import util
fun map<T,U>(): map<T,U> {
var toRet.construct(): map<T,U>
return toRet
}
fun map<T,U>(key: ref T, value: ref U): map<T,U> {
var toRet.construct(): map<T,U>
toRet.set(key, value)
return toRet
}
obj map<T,U> (Object, Serializable) {
var keys: vec::vec<T>
var values: vec::vec<U>
fun construct(): *map<T,U> {
keys.construct()
values.construct()
return this
}
fun copy_construct(old: *map<T,U>) {
keys.copy_construct(&old->keys)
values.copy_construct(&old->values)
}
fun operator=(rhs: ref map<T,U>) {
keys = rhs.keys
values = rhs.values
}
fun destruct() {
keys.destruct()
values.destruct()
}
fun serialize(): vec::vec<char> {
return serialize::serialize(keys) + serialize::serialize(values)
}
fun unserialize(it: ref vec::vec<char>, pos: int): int {
pos = keys.unserialize(it, pos)
pos = values.unserialize(it, pos)
return pos
}
fun size(): int {
return keys.size
}
// The extra template parameter V (used where U would do) is the usual trick to keep
// this method from being instantiated unless it is actually called.
fun operator==<V>(other: ref map<T,V>): bool {
return keys == other.keys && values == other.values
}
fun operator[]=(key: ref T, value: ref U) {
set(key,value)
}
fun set(key: ref T, value: ref U) {
var keyIdx = keys.find(key)
if (keyIdx >= 0) {
values.set(keyIdx, value)
return;
}
keys.add(key)
values.add(value)
}
fun contains_key(key: ref T): bool {
return keys.contains(key)
}
fun contains_value<V>(value: ref V): bool {
return values.contains(value)
}
fun get(key: ref T): ref U {
var key_loc = keys.find(key)
if (key_loc == -1)
util::error("trying to access nonexistant key-value!")
return values.get(key_loc)
}
fun get_ptr_or_null(key: ref T): *U {
var key_loc = keys.find(key)
if (key_loc == -1)
return mem::null<U>()
return &values.get(key_loc)
}
fun get_with_default(key: ref T, default_val: ref U): ref U {
if (contains_key(key))
return get(key)
return default_val
}
fun reverse_get<V>(value: ref V): ref T {
var value_loc = values.find(value)
if (value_loc == -1)
util::error("trying to access nonexistant value-key!")
return keys.get(value_loc)
}
fun remove(key: ref T) {
var idx = keys.find(key)
if (idx < 0)
return;
keys.remove(idx)
values.remove(idx)
}
fun clear() {
keys.clear()
values.clear()
}
fun operator[](key: ref T): ref U {
return get(key)
}
fun for_each(func: fun(T, U):void) {
for (var i = 0; i < keys.size; i++;)
func(keys[i], values[i])
}
fun for_each(func: fun(ref T, ref U):void) {
for (var i = 0; i < keys.size; i++;)
func(keys[i], values[i])
}
fun associate<O,N>(func: fun(T,U): util::pair<O,N>): map<O,N> {
var to_ret = map<O,N>()
for (var i = 0; i < keys.size; i++;) {
var nkv = func(keys[i], values[i])
to_ret[nkv.first] = nkv.second
}
return to_ret
}
fun pop(): util::pair<T,U> {
return util::make_pair(keys.pop(), values.pop())
}
}
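// A hedged usage sketch (not part of the original file): illustrates set/get on this
// vector-backed map; lookups scan the key vector, so access is O(n).
fun map_example(): void {
    var squares = map<int,int>()
    var k = 2
    var v = 4
    squares[k] = v                  // operator[]= forwards to set()
    if (squares.contains_key(k))
        io::println(squares[k])     // prints 4
}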

View File

@@ -1,29 +0,0 @@
#link("m")
fun fibanacci(num: int): int {
var l1 = 1
var l2 = 1
for (var i = 0; i < num; i++;) {
var next = l1 + l2
l2 = l1
l1 = next
}
return l1
}
/*********************
* Trig Functions
********************/
ext fun atan(arg: double): double
ext fun atan2(y: double, x: double): double // matches C's atan2(y, x) argument order
ext fun acos(arg: double): double
ext fun asin(arg: double): double
ext fun tan(arg: double): double
ext fun cos(arg: double): double
ext fun sin(arg: double): double
// floating-point remainder: truncate the quotient toward zero, like C's fmod
fun mod(x: double, y: double): double
{
    var intAns = ((x / y) cast long) cast double;
    return x - intAns*y;
}
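// A hedged usage sketch (not part of the original file): with the quotient truncated toward
// zero, mod(7.5, 2.0) gives 7.5 - 3*2.0 == 1.5, matching C's fmod for positive arguments.
fun mod_example(): double {
    return mod(7.5, 2.0)   // 1.5
}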

View File

@@ -1,146 +0,0 @@
import vec:*;
import io:*;
obj matrix (Object) {
var data: vec<double>;
var rows: int;
var cols: int;
///******************************
// Constructors
///*****************************/
//Constructor with no arguments
//No matrix is made
fun construct(): *matrix {
rows = 0;
cols = 0;
data.construct();
return this;
}
//Constructor with single argument
//Creates an N x N matrix
fun construct(size: int): *matrix {
rows = size;
cols = size;
data.construct(rows*cols);
return this;
}
//Constructor with two arguments
//Creates an N x M matrix
fun construct(r: int, c: int): *matrix {
rows = r;
cols = c;
data.construct(rows*cols);
return this;
}
///****************************
// Utility Functions
///***************************/
//Test using indexing at 0
fun test0(i: int, j: int): bool {
var index = i*cols + j;  // row-major: i indexes the row, j the column
if(index > (rows * cols - 1) ) {
print("Index (");
print(i);
print(", ");
print(j);
println(") is out of bounds.");
print("Max index = (");
print(rows-1);
print(", ");
print(cols-1);
println(").");
return false;
}
return true;
}
//Test using indexing at 1
fun test1(i: int, j: int): bool {
var index = (i-1)*cols + (j-1);
if(index > (rows * cols - 1) ) {
print("Index (");
print(i);
print(", ");
print(j);
println(") is out of bounds.");
print("Max index = (");
print(rows);
print(", ");
print(cols);
println(").");
return false;
}
return true;
}
//Access matrix element
fun at(i: int, j: int): double {
var index = i*cols + j;
if(test0(i,j))
return data.at(index);
return 0;
}
//Set matrix element
fun set(i: int, j: int, num: double): void {
var index = i*cols + j;
if(test0(i,j))
data.set(index,num);
return;
}
fun printMatrix(): void {
for(var i: int = 0; i < rows; i++;)
{
for(var j: int = 0; j < cols; j++;)
{
print(at(i,j));
print(" ");
}
println(" ");
}
return;
}
///**************************
// Linear Algebra Functions
//**************************/
// in-place transpose; note this assumes a square matrix (rows == cols)
fun transpose(): void {
var val1: double;
var val2: double;
for(var n: int = 0; n <= rows - 2; n++;)
for(var m: int = n+1; m <= rows - 1; m++;){
val1 = at(n, m);
val2 = at(m, n);
set(n, m, val2);
set(m, n, val1);
}
return;
}
};//end Matrix class
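// A hedged usage sketch (not part of the original file): builds a 2x2 matrix, sets one
// element, and transposes it in place (transpose assumes a square matrix).
fun matrix_example(): void {
    var m.construct(2): matrix
    m.set(0, 1, 3.0)
    m.transpose()
    print(m.at(1, 0))   // prints the value now at (1,0), i.e. 3.0 (print comes from "import io:*")
}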

View File

@@ -1,126 +0,0 @@
ext fun malloc(size: ulong): *void
ext fun free(size: *void)
ext fun memmove(dest: *void, src: *void, size: ulong): *void
fun calloc(size: ulong): *void {
var to_ret = malloc(size)
for (var i = 0; i < size; i++;)
*((to_ret) cast *char + i) = 0
return to_ret
}
fun null<T>(): *T
return (0) cast *T
fun new<T>(count: int): *T
return (malloc( (#sizeof<T> * count ) cast ulong )) cast *T
fun new<T>(): *T
return new<T>(1)
/* We specialize on the Object trait to decide whether or not the destructor should be called */
fun delete<T>(toDelete: *T, itemCount: int)
delete<T>(toDelete)
/* Calling this with itemCount = 0 allows you to delete destructible objects without calling their destructors. */
fun delete<T(Object)>(toDelete: *T, itemCount: int): void {
// run the destructor on every element, then free the whole allocation once
for (var i: int = 0; i < itemCount; i++;)
toDelete[i].destruct();
free((toDelete) cast *void);
}
/* We specialize on the Object trait to decide whether or not the destructor should be called */
fun delete<T>(toDelete: *T)
free((toDelete) cast *void)
fun delete<T(Object)>(toDelete: *T): void {
toDelete->destruct();
free((toDelete) cast *void);
}
// a wrapper for construct if it has the Object trait
fun maybe_construct<T>(it:*T):*T
return it
fun maybe_construct<T(Object)>(it:*T):*T
return it->construct()
// a wrapper for copy constructing if it has the Object trait
fun maybe_copy_construct<T>(to:*T, from:*T)
*to = *from
fun maybe_copy_construct<T(Object)>(to:*T, from:*T)
to->copy_construct(from)
// a wrapper for destruct if it has the Object trait
fun maybe_destruct<T>(it:*T) {}
fun maybe_destruct<T(Object)>(it:*T)
it->destruct()
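// A hedged usage sketch (not part of the original file): int does not carry the Object trait,
// so delete(nums, 4) resolves to the plain overload above and only frees the allocation;
// a type declaring Object would get destruct() called on each element first.
fun delete_example(): void {
    var nums = new<int>(4)          // uninitialized storage for 4 ints
    for (var i = 0; i < 4; i++;)
        nums[i] = i
    delete(nums, 4)
}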
// NOTE: refCount is stored by value in each shared_ptr, so copies made through
// construct(ref shared_ptr<T>) keep independent counts rather than one shared count;
// until that is reworked, treat this as a simple owning pointer.
obj shared_ptr<T> (Object){
var data: *T;
var refCount: int;
fun construct(): *shared_ptr<T> {
data = 0;
refCount = 1;
return this;
}
fun construct(newPtr: *T): *shared_ptr<T> {
data = newPtr;
refCount = 1;
return this;
}
fun construct(newPtr: ref shared_ptr<T>): *shared_ptr<T> {
data = newPtr.data;
refCount = newPtr.refCount;
refCount++;
return this;
}
fun destruct(): void {
    if(refCount == 1 && data != null<T>()){
        delete(data,1);
        refCount--;
    }
}
fun operator*(): ref T {
return *data;
}
fun operator->(): *T {
return data;
}
fun operator=(newPtr: ref shared_ptr<T>): ref shared_ptr<T> {
if(this != &newPtr){
if(refCount == 1){
delete(data,1);
refCount--;
}
//use copy constructor here???
data = newPtr.data;
refCount = newPtr.refCount;
refCount++;
}//end self-assignment check
return *this;
}
fun operator=(newPtr: ref *T): ref shared_ptr<T> {
    // release the old data (if we own it) before taking ownership of the new pointer
    if(refCount == 1)
        delete(data,1);
    data = newPtr;
    refCount = 1;
    return *this;
}
}; //end shared_ptr class

View File

@@ -1,35 +0,0 @@
import symbol:*
import tree:*
import vec:*
import map:*
import util:*
import str:*
import mem:*
import io:*
import ast_nodes:*
import ast_transformation:*
import pass_common:*
fun node_counter(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>) {
var counter = node_counter_helper(name_ast_map, ast_to_syntax)
println(str("Number of nodes touched: ") + counter)
}
fun node_counter_test(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>) {
var counter = node_counter_helper(name_ast_map, ast_to_syntax)
if (counter > 10000)
println("more than 10000 nodes!")
}
fun node_counter_helper(name_ast_map: *map<str, pair<*tree<symbol>,*ast_node>>, ast_to_syntax: *map<*ast_node, *tree<symbol>>): int {
var counter = 0
var visited = hash_set<*ast_node>()
name_ast_map->for_each(fun(name: str, syntax_ast_pair: pair<*tree<symbol>,*ast_node>) {
var helper = fun(node: *ast_node, parent_chain: *stack<*ast_node>) {
counter++
}
run_on_tree(helper, empty_pass_second_half(), syntax_ast_pair.second, &visited)
})
return counter
}

Some files were not shown because too many files have changed in this diff.