Confidentiality describes how a system manages the data entrusted to it, and it is a key pillar of trust. A system must tell the user how the data it collects is intended to be used, and then use the data only in that manner. This notice must be written in clear, concise terms that are easy to understand. A long “Terms of Use” document is rarely read and is often simply ignored.
The simple statement would be: “The information you provide is used by others only when you give explicit permission.” This is hard to implement. It means that logs cannot contain user data of any kind. It means that all data is private until the user gives permission to share that data with the system. This type of system requires more work from the user, but it also provides greater reassurance that, should a breach occur, no user data will be exposed. As we will see, it also makes compliance easier.
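One way to keep user data out of logs is to log only fields that have been explicitly marked as safe, and to drop everything else. The following is a minimal sketch of that idea, not a complete logging policy. The field names in SAFE_LOG_FIELDS and the helper safe_for_logging are hypothetical and chosen only for illustration.

```python
import logging

# Hypothetical allowlist: the only fields this sketch permits in a log line.
# Anything not listed here is treated as potential user data and dropped.
SAFE_LOG_FIELDS = frozenset({"event", "status", "duration_ms"})


def safe_for_logging(event: dict) -> dict:
    """Return a copy of the event containing only allowlisted fields."""
    return {key: value for key, value in event.items() if key in SAFE_LOG_FIELDS}


logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("app")

raw_event = {
    "event": "assignment_submitted",
    "status": "ok",
    "duration_ms": 42,
    "student_email": "student@example.com",  # user data: must not reach the log
}

# Only the allowlisted fields are written to the log.
logger.info("event=%s", safe_for_logging(raw_event))
```

An allowlist is used here rather than a list of fields to redact, so that a newly added field stays out of the logs until someone decides it is safe to record.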
The initial reaction to that statement is typically at least one of the following: “What about technical support?”, “What about logging?”, or, in a business-to-business software-as-a-service application, “Doesn’t my company administrator have access?” The first two questions will be answered later. As for the last, there will be cases where default access has been granted to other users. This access must be clearly identified when the user first logs in. In an ideal world, the user would also be reminded at each login, much like the messages employees see when they log into computers that are monitored.
A confidential system must be able to accurately identify the user of the system. User identification is called authentication. Many types of authentication exist, such as “Username and Password”, “Biometric”, “OAuth”, “SAML”, and more. Each has different benefits and drawbacks. The choice of authentication method depends somewhat on the use case of the application, but digging into the details of authentication types can overcomplicate the general discussion of software security. The important takeaway here is that any system must verify the identity of the user before granting access in order to have any level of confidentiality.
The second step in designing a confidential system is deciding how to manage access privileges, which is called authorization. Three principles are regularly used: “Deny By Default”, “Least Privilege”, and “Need to Know.”
“Deny By Default” states that any access request should be denied unless it is explicitly allowed. Think of a security guard at the door of a party checking identification against a guest list: anyone who is not on the list does not get in.
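In code, the guest list can be sketched as an explicit set of granted permissions, with every other request refused. The users, actions, and the is_allowed helper below are hypothetical names used only to illustrate the principle.

```python
# Hypothetical guest list: (user, action) pairs that were explicitly granted.
ALLOWED = {
    ("alice", "read_report"),
    ("bob", "read_report"),
    ("bob", "edit_report"),
}


def is_allowed(user: str, action: str) -> bool:
    """Deny by default: grant access only if the pair is explicitly listed.

    Unknown users and unknown actions need no special handling;
    they are simply not on the list, so they are denied.
    """
    return (user, action) in ALLOWED


print(is_allowed("bob", "edit_report"))
print(is_allowed("alice", "edit_report"))
print(is_allowed("mallory", "read_report"))
```

The important design choice is that there is no "else allow" branch anywhere: forgetting to add a rule results in a denied request rather than an accidental grant.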
“Least Privilege” dictates that a user gets only the access needed to perform the activities they are meant to perform. In an education application, for example, a student would only be able to complete and submit assignments. Students cannot grade assignments; that is the job of the teacher. Likewise, students can read their cumulative grades, whereas teachers can edit them.
A system that implements “Need to Know” factors in ownership of the information. Returning to the student example, each student only needs to know about their own assignments and grades. They do not need to know about the assignments and grades of other students in their class.
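The student and teacher example can be sketched as two checks applied in sequence: first whether the role includes the action at all (least privilege), then whether the record is one this particular user has a reason to see (need to know). The permission names, the User and GradeRecord types, and the rule that a teacher only sees records for classes they teach are assumptions made for this sketch.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission table: each role gets only what it needs.
ROLE_PERMISSIONS = {
    "student": {"submit_assignment", "read_grade"},
    "teacher": {"grade_assignment", "read_grade", "edit_grade"},
}


@dataclass(frozen=True)
class User:
    name: str
    role: str


@dataclass(frozen=True)
class GradeRecord:
    student: str  # the student who owns this record
    teacher: str  # the teacher responsible for the class (assumed field)
    grade: str


def can(user: User, action: str, record: GradeRecord) -> bool:
    # Least privilege: the user's role must explicitly include the action.
    if action not in ROLE_PERMISSIONS.get(user.role, set()):
        return False
    # Need to know: students see only their own records; teachers see
    # only records for classes they teach (an assumption of this sketch).
    if user.role == "student":
        return record.student == user.name
    if user.role == "teacher":
        return record.teacher == user.name
    return False


record = GradeRecord(student="alice", teacher="ms_lee", grade="A")
print(can(User("alice", "student"), "read_grade", record))  # own grade
print(can(User("bob", "student"), "read_grade", record))    # another student's grade
```

Keeping the two checks separate makes it clear which principle a given denial comes from, which helps when auditing access decisions.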
There are several strategies for implementing these three principles. Two of the most common are role-based access control (RBAC) and attribute-based access control (ABAC). RBAC typically assigns users to groups and assigns roles to those groups. When a component or action is accessed, the system verifies that one of the user’s roles is among the roles allowed for that component or action. ABAC tracks attributes of the user and of the information being acted upon, and decides whether to grant access by evaluating rules over those attributes.
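The difference between the two strategies is easiest to see side by side. The sketch below is illustrative only: the group names, role names, attribute keys, and the single ABAC rule are all hypothetical, and a real system would load them from configuration or a policy engine rather than hard-code them.

```python
# --- RBAC: users belong to groups, groups carry roles, actions list allowed roles ---
USER_GROUPS = {"alice": {"class_7b"}, "ms_lee": {"faculty"}}
GROUP_ROLES = {"class_7b": {"student"}, "faculty": {"teacher"}}
ACTION_ROLES = {"submit_assignment": {"student"}, "grade_assignment": {"teacher"}}


def roles_of(user: str) -> set:
    """Collect every role the user holds through group membership."""
    roles = set()
    for group in USER_GROUPS.get(user, set()):
        roles |= GROUP_ROLES.get(group, set())
    return roles


def rbac_allows(user: str, action: str) -> bool:
    # Allowed only if the user holds at least one role permitted for the action.
    return bool(roles_of(user) & ACTION_ROLES.get(action, set()))


# --- ABAC: a rule compares attributes of the user and of the resource ---
def abac_allows(user_attrs: dict, resource_attrs: dict, action: str) -> bool:
    if action == "read_grade":
        is_owner = (
            user_attrs.get("role") == "student"
            and user_attrs.get("id") == resource_attrs.get("student_id")
        )
        teaches_class = (
            user_attrs.get("role") == "teacher"
            and resource_attrs.get("class_id") in user_attrs.get("classes", ())
        )
        return is_owner or teaches_class
    # No rule matches this action, so the request is denied.
    return False


grade = {"student_id": "s1", "class_id": "7b"}
print(rbac_allows("alice", "grade_assignment"))
print(abac_allows({"role": "student", "id": "s1"}, grade, "read_grade"))
```

Note that the RBAC check never looks at the data being accessed, while the ABAC check does. That is why ABAC can express “Need to Know” rules directly, whereas a purely role-based system needs an additional ownership check.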
The final main discussion point of confidentiality is how to hide data. There are typically two ways to accomplish this: hashing and encryption. The two are similar in that each takes data in its original form and converts it into what appears to be a random series of values. Hashing cannot be reversed. Encryption, on the other hand, can be reversed by anyone who holds the correct key. We’ll go into a little more technical detail on that in a future section.
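The one-way versus reversible distinction can be shown with a short sketch. Hashing uses SHA-256 from the Python standard library. Because the standard library has no general-purpose cipher, encryption is illustrated here with a one-time-pad XOR, which is an assumption of this sketch chosen to keep it self-contained; a production system would use a vetted cryptographic library instead. The helper names sha256_hex and xor_bytes are hypothetical.

```python
import hashlib
import secrets


def sha256_hex(data: bytes) -> str:
    """Hash the data. There is no function that turns the digest back into data."""
    return hashlib.sha256(data).hexdigest()


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """One-time-pad XOR, for illustration only.

    Applying it twice with the same key returns the original data,
    which is what makes this transformation reversible.
    """
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"grade: A"

# Hashing: same input always gives the same digest, but the digest
# cannot be converted back into the message.
digest = sha256_hex(message)

# Encryption: the holder of the key can recover the original message.
key = secrets.token_bytes(len(message))  # random key, used once
ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)

print(digest)
print(recovered == message)
```

Hashing therefore suits values that only ever need to be compared, while encryption suits values that an authorized party must be able to read again.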
It is important to note here that privacy is often mentioned when talking about confidentiality. Some will argue that the two are different, but they are effectively the same. A confidential system is one that maintains the privacy of the end user. It treats all data elements, even those at a system level, as information between the system and the user. For example, a system administrator would never need to see the grades of the students, even in a log file. A technical support person should be granted access to read a paper only when the student asks for support, for instance to find out why a file is not accessible. This will definitely make things harder at first, but in the long run it will actually make systems less complex.
The final point here is that the confidentiality of a system refers to all aspects of the system. This covers wherever data is saved, however data is transferred, and whoever has access to the system. This can be a difficult task. However, with the goal of a completely confidential system in mind, product owners, software engineers, and customer support will behave in a manner that gets an organization closer to it than ever before.