
SECURITY DESIGN IN DISTRIBUTED COMPUTING APPLICATIONS by Michael P. Zeleznik

A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Department of Computer Science The University of Utah December 1993

Copyright © Michael P. Zeleznik 1993. All Rights Reserved.

THE UNIVERSITY OF UTAH GRADUATE SCHOOL

SUPERVISORY COMMITTEE APPROVAL of a dissertation submitted by Michael P. Zeleznik

This dissertation has been read by each member of the following supervisory committee and by majority vote has been found to be satisfactory.

Chair: Lee Hollaar

Robert Kessler

David Hanscom

THE UNIVERSITY OF UTAH GRADUATE SCHOOL

FINAL READING APPROVAL

To the Graduate Council of the University of Utah: I have read the dissertation of Michael P. Zeleznik in its final form and have found that (1) its format, citations, and bibliographic style are consistent and acceptable; (2) its illustrative materials including figures, tables, and charts are in place; and (3) the final manuscript is satisfactory to the Supervisory Committee and is ready for submission to The Graduate School. Date

Lee Hollaar

Chair, Supervisory Committee

Approved for the Major Department

Thomas C. Henderson Chair/Dean

Approved for the Graduate Council

Ann W. Hart

Dean of The Graduate School

ABSTRACT

The software developer designing a security architecture for a distributed application is often faced with practical constraints that further complicate an already difficult task. These include limited resources and conflicting requirements. The goal will often be to simply provide as much effective security as possible, targeted at the end-user security needs. To achieve this goal, the developer must be able to systematically determine where security problems exist, understand the impact of security mechanisms as they are designed, determine which problems have and have not been addressed, explore alternative designs, and build on the architecture in the future. Current approaches to secure system design do not meet these requirements. Although much is understood about many aspects of computer security, little attention has been given to the issue of how to integrate this knowledge into a design process; of how to generate and maintain a security architecture in a systematic, predictable manner. In this dissertation, this issue is examined in the context of a distributed information retrieval system. The security design problem is analyzed, defining and characterizing its underlying causes and the tools desirable to address it. This is followed by a classification and analysis of current security design approaches in relation to this problem, including a detailed case history of our efforts to employ risk management paradigms, demonstrating their strengths and limitations. A new security design methodology is then presented, which provides the desired tools. All aspects of the application, supporting software, hardware, physical environments, and attack scenarios are modeled in a unified, object-oriented manner. An information flow analysis is applied to automatically discover security violations. Safeguards can then be modeled and added to the flow analysis to readily assess their effects and interrelationships. Although the developer must create the application-specific models, the remaining models such as those for the operating system, hardware, environments, and attack scenarios are application-independent, and can evolve over time to represent generic security knowledge bases. These can be utilized by any developer employing this design methodology, regardless of their security background.

To Terri, for putting up with me...

CONTENTS

ABSTRACT
LIST OF FIGURES
GLOSSARY

CHAPTERS

1. INTRODUCTION
   1.1 Overview of this Research
   1.2 Significance of this Research
   1.3 Need for Security in Distributed Systems
   1.4 The Security Design Problem
       1.4.1 Fundamental Difficulties in Security Design
       1.4.2 Concerns of the Application Developer
       1.4.3 Security Needs of the Application Developer
   1.5 Current Lack of Design Tools
   1.6 A Proposed Solution
   1.7 Road Map to this Dissertation

2. BACKGROUND ISSUES
   2.1 Information Retrieval Systems
       2.1.1 What They Are
       2.1.2 The URSA System
       2.1.3 Security Needs in Information Retrieval
   2.2 Overview of Related Work
       2.2.1 Brief Historical Perspective on Computer Security
       2.2.2 The Bigger Picture

3. THE SECURITY DESIGN PROBLEM
   3.1 Fundamental Difficulties in Security Design
       3.1.1 Examples of the Difficulties
       3.1.2 The Lack of a Design Methodology
   3.2 Concerns of the Application Developer
       3.2.1 Practical Considerations
       3.2.2 End-User Physical Environment
       3.2.3 End-User Security Needs
       3.2.4 Some Security Design Scenarios
   3.3 Design Tools for the Application Developer

4. APPROACHES TO SECURE SYSTEM DESIGN
   4.1 A Lack of Existing Design Tools
   4.2 A Categorization of Current Design Efforts
   4.3 Find and Fix Approach
   4.4 Preventive Design Approach
   4.5 Formal Methods
       4.5.1 Specification and Verification
       4.5.2 Formal Policy Models
   4.6 Guideline Adherence
       4.6.1 Comprehensive References
       4.6.2 Guidelines and Standards
   4.7 Risk Management
   4.8 Safe and Reliable Systems Design
   4.9 Conventional Software Development
   4.10 Experience
   4.11 Recent Work Supporting Our Position
   4.12 Summary

5. RISK MANAGEMENT TECHNIQUES
   5.1 Definition of Terms
       5.1.1 Risk Management
       5.1.2 Determine Assets
       5.1.3 Determine Possible Threats Against Assets
       5.1.4 Determine Possible Vulnerabilities
       5.1.5 Rank Vulnerabilities in Terms of Risk
       5.1.6 Design Safeguards to Address Vulnerabilities
   5.2 A Note on Practical Approaches
   5.3 The Issue of Risk Analysis

6. ATTEMPTS TO EMPLOY RISK MANAGEMENT
   6.1 Starting to List Assets
       6.1.1 Initial Approach
       6.1.2 Difficulties with Listing Assets
       6.1.3 A Relational Assets Structure
       6.1.4 Employing Checklists to Add Structure
       6.1.5 Fundamental Problem with Assets Lists
   6.2 Threats and Vulnerability Lists
       6.2.1 Ill-defined, Less Tangible than Assets
       6.2.2 Threats Lists
       6.2.3 Vulnerability Lists
   6.3 The Information Explosion
   6.4 Safeguard Design
       6.4.1 Trojan Horse Example
       6.4.2 The Problem
   6.5 Simplifying Assets and Threats Lists
   6.6 Assessing the Problems at This Point
       6.6.1 Overview
       6.6.2 Synopsis of the Issues
   6.7 Need for Dynamic Information
       6.7.1 The Control Point Approach
       6.7.2 Data Flow Analysis
       6.7.3 Dependency Graphs
       6.7.4 Problems with Dependency Graphs
   6.8 Need for Detailed Dynamic Analysis
   6.9 Some Security Holes Provide Insight
       6.9.1 Document List Access Control Filter
       6.9.2 The Index Spell Command
       6.9.3 Auto Search and Retrieval Functions
   6.10 A Note on Security Policy
   6.11 Assessing the Problems at This Point
       6.11.1 Overview
       6.11.2 Synopsis of the Issues
   6.12 Summary

7. A SECURITY DESIGN METHODOLOGY
   7.1 Background
       7.1.1 The Focus of These Chapters
       7.1.2 Why a Risk Management Foundation?
       7.1.3 Security Policy
   7.2 Determining Assets
   7.3 Information Flow Analysis (IFA)
       7.3.1 The Meaning of Information Flow
       7.3.2 Local IFA
       7.3.3 How to Produce the LIFA
       7.3.4 A LIFA Example
       7.3.5 Global IFA
       7.3.6 A GIFA Example
   7.4 Notes on the Flow Analysis Process
       7.4.1 The LIFA Development
       7.4.2 The Flow Analysis
       7.4.3 Synopsis
   7.5 The Rest of the Story
       7.5.1 Application versus the Environment
       7.5.2 Environment Planned versus Unplanned Operations
       7.5.3 Application Planned versus Unplanned Operations
       7.5.4 The Complete Hierarchical Breakdown
       7.5.5 Significance of this Breakdown
       7.5.6 The Remaining Sections
   7.6 Viewing the World as Flows
   7.7 Environmental Planned Operations
       7.7.1 The Key Modeling Concepts
       7.7.2 A Layered, Object-based Approach
       7.7.3 Explicit vs. Implicit Service Providers
       7.7.4 Objects Providing Explicit Services to the Application
       7.7.5 The Remaining Issues
       7.7.6 The Parasite Concept
       7.7.7 External versus Internal Security Characteristics
       7.7.8 Now for the Distributed Aspects
       7.7.9 The Physical Environment
       7.7.10 Explicit Providers to Nonapplication Objects
       7.7.11 Complexity and Reusability
   7.8 Environmental Unplanned Operations
       7.8.1 Perspective
       7.8.2 The Attack Modules (AMODs)
       7.8.3 Details of the AMOD Development
       7.8.4 Reusable Modules from an Evolutionary Process
   7.9 Application Unplanned Operations
       7.9.1 What They Are
       7.9.2 Adding Pseudo-Operations
   7.10 The Power of this Modeling Approach
   7.11 Synopsis of the Design Method at This Point

8. CALCULATING THE FLOW ANALYSIS
   8.1 Introduction
   8.2 Roots of the Complexity
   8.3 Approaches to the Flow Analysis Problem
       8.3.1 Security Information Flow Analysis
       8.3.2 Data Flow Analysis in General
       8.3.3 Petri Net Modeling
       8.3.4 Algebraic Approaches
   8.4 A Simulation-Based Approach
       8.4.1 Partitioning the Simulation
       8.4.2 Security Relevance Information
       8.4.3 The Following Sections
   8.5 Simulating the Application Group
       8.5.1 Application Planned Operations
       8.5.2 Application Unplanned Operations
       8.5.3 Explicit Application Service Operations
   8.6 How to Drive the Application Simulation
   8.7 Flow Calculations and Policy Checks
       8.7.1 Statically Assigned Variables
       8.7.2 Nonassigned Variables
       8.7.3 The Key to Policy Violation Calculations
       8.7.4 Transport Variables
       8.7.5 State Variables
   8.8 Simulating the Environment Group
       8.8.1 The Environment as Separate from the Application
       8.8.2 Problems Simulating the Environment Group
   8.9 Searching for Ways to Simplify
       8.9.1 The Planned Environment
       8.9.2 The Unplanned Environment
       8.9.3 Rate of Change of Application and Environment
   8.10 Method of Simulation and Flow Analysis
       8.10.1 Simplifying Application/Environment Interaction
       8.10.2 Analysis of This Simplification
       8.10.3 Top-Level FSM (TLFSM) for Synchronization
       8.10.4 Handling ASRVs Across TLFSM States
   8.11 Complete View of the Flow Analysis
       8.11.1 Brief Review of Types of SRV
       8.11.2 Calculating the Flows
       8.11.3 Initialization Issues
       8.11.4 Ghost Objects
       8.11.5 Object Relocation
       8.11.6 Policy Validation
       8.11.7 Notes on Specifying SRV Information

9. SAFEGUARD DESIGN
   9.1 What is a Safeguard?
   9.2 Fundamental Safeguard Types
   9.3 Safeguard Design Techniques
   9.4 Issues in Safeguard Design
   9.5 Reconciling the Black-and-White View
   9.6 Net-effect versus the Design of Safeguards
       9.6.1 The Problem
       9.6.2 Trying to Manually Specify Net Effect
       9.6.3 Automatically Assessing the Net Effect
   9.7 Specifying Safeguard Design at Three Levels
       9.7.1 The Issue
       9.7.2 A Three Level Approach
       9.7.3 Safeguard Behavioral Description (SAFE-BD)
       9.7.4 Safeguard Design Descriptions (SAFE-DD)
       9.7.5 Safeguard Conceptual Description (SAFE-CD)
   9.8 Mapping Safeguards into our Model
       9.8.1 Prevention Safeguards
       9.8.2 Detection Safeguards
       9.8.3 Acceptance Safeguards
       9.8.4 General Audit and Deterrent Safeguards
       9.8.5 Untargeted Safeguards

10. SUMMARY OF THE DESIGN METHODOLOGY
   10.1 Information Flow Analysis
   10.2 The Rest of the Story
   10.3 Viewing the World as Flows
   10.4 Environmental Planned Operations
       10.4.1 A Layered, Object-Based Approach
       10.4.2 Explicit versus Implicit Service Providers
       10.4.3 Modeling Explicit Services to the Application
       10.4.4 Handling the Remaining Issues
       10.4.5 The Parasite Concept
       10.4.6 The Physical Environment
   10.5 Environmental Unplanned Operations
   10.6 Application Unplanned Operations
   10.7 Power of this Modeling Approach
   10.8 Calculating the Flow Analysis
       10.8.1 Simulating the Application Group
       10.8.2 Driving the Application Simulation
       10.8.3 Simulating the Environment Group
       10.8.4 A Way to Simplify This
       10.8.5 Flow Calculations and Policy Checks
   10.9 Safeguard Design
       10.9.1 Reconciling the Black-and-White View
       10.9.2 Net-effect versus the Design of Safeguards
       10.9.3 Specifying Safeguard Design at Three Levels
   10.10 Mapping Safeguards into our Model
       10.10.1 Prevention SAFEs
       10.10.2 Detection Safeguards
       10.10.3 Acceptance Safeguards
       10.10.4 General Audit and Deterrent Safeguards
       10.10.5 Untargeted Safeguards

11. CONCLUSIONS AND FUTURE WORK
   11.1 Conclusions
   11.2 Future Work

APPENDICES
A. THE URSA SYSTEM
B. SECURITY POLICY MODELS
C. A RELATIONAL ASSETS LIST
D. THREAT SOURCES
E. THREAT TECHNIQUES
F. SECURITY DESIGN PRINCIPLES

REFERENCES

LIST OF FIGURES

6.1 Example of Assets Explosion and Hierarchy
6.2 Model for Simplifying Assets and Threats Analysis
6.3 Control Points
6.4 High-level Flow Diagram for Entire Application
6.5 Flow Diagram for Database Loading Operation
6.6 Flow Diagram for Generalized System Operation
6.7 Operation Extended to Distributed Environment
6.8 Simple Dependency List Example
6.9 Dependency Diagram with Threats and Vulnerabilities
6.10 Full Dependency Diagram with Vulnerabilities
7.1 Policy Flow Examples
7.2 Control Flow During Find Operation
7.3 LIFA Pseudo Code
7.4 LIFA Finite State Machine
7.5 LIFA State Machine with Flows
7.6 User Interface LIFA
7.7 User LIFA
7.8 World View of Application and its Environment
7.9 Environment with Planned and Unplanned Operations
7.10 Application and Environment Planned/Unplanned Distinction
7.11 Global Binding Table
7.12 Model of Communication System
7.13 Model of Nondistributed Filesystem
7.14 Model of Distributed Filesystem
7.15 Simple Parasite MonOS
7.16 Example of Utility of MonOS
7.17 Parasite Extensions to Global Binding Table
7.18 Global Binding Table with Parasite Bindings
7.19 Operating System with Memory Protection and Authentication
7.20 Modeling Four Operating Systems with Simple Variables
7.21 Distributed Operating System
7.22 Distributed OS with Memory Protection and Authentication
7.23 Physical Room Environments
7.24 Open vs. Secure Node
7.25 Arbitrary Room
7.26 An Open Node in a Secure Room
7.27 Secure Room After Hours
7.28 Arbitrary Person
7.29 Two Instances of an Arbitrary Person
7.30 Fault Tree Approach
7.31 AMOD of Node and Disk Attack
7.32 Node AMOD with Variables from Node and Person PMODs
7.33 Examples of Safeguard Interaction with AMODs/PMODs
7.34 Printer with Line to Computer
7.35 Printer Line with State Variable Modeling EMP
7.36 Operating System AMOD
7.37 Fragment of LIFA for User Interface
7.38 Pseudo-Operation Added to LIFA
8.1 Auto-Search Feature
8.2 UserIF and Index Flows
9.1 Simple Example of Net Effect versus Design
B.1 Security Policy Defines Responsibility

GLOSSARY

AMOD - Attack modules. Models of the unplanned operation of the environment, consisting of attacks against environmental objects, utilizing the parasite approach.

ASRV - Accumulated SRV. The ASRV attribute of a variable accumulates the set of SRVs that have flowed into that variable, generally for the lifetime of a given static environment. See also TSRV.

BLP - The Bell and LaPadula security policy model.

CFSM - Communicating finite state machine.

DOM - Dominates relation. A class X dominates a class Y iff X.level ≥ Y.level and X.category-set ⊇ Y.category-set.

EMP - Electromagnetic Pickup.

FSM - Finite state machine.

IC - Integrity Class. A value identifying the level of integrity of an object (e.g., high or low). See also SC.

IFA - Information Flow Analysis.

GIFA - Global information flow analysis. The flow analysis of a set of communicating FSMs (each represented with a LIFA model).

LIFA - Local information flow analysis. The FSM flow model for an application or environmental module.

OS - Operating system.

MLS - Multi-level secure system. A system in which data and users at different security levels can simultaneously exist, but which controls access based on a security policy.

PCheck - Policy check. The algorithm to determine if a given flow upholds or violates the security policy.

PMOD - Models of the planned operation of the environment, including software and the physical environment, often utilizing the parasite approach.

SAFE - Safeguard. Any mechanism that attempts to reduce loss due to an undesirable event (e.g., a security policy violation).

SC - Secrecy Class. A value identifying the level of secrecy of an object (e.g., top secret or unclassified). See also IC.

SECREL - Refers to the general security relevant aspects of an object or situation (e.g., Is SECREL information present? What are the SECREL aspects of this object?). See also SRV.

SRV - The collective security relevance value of an object. For our simple policy, this is the value of its secrecy class (object.SC) and integrity class (object.IC). See also SECREL.

SSRV - Static SRV. The SSRV attribute of a variable is an SRV that is manually assigned to fundamental assets. All other variables have NULL SSRVs.

TCSEC - (Orange Book) Trusted Computer System Evaluation Criteria.

TEMPEST - Secured against information leakage by emission of electromagnetic radiation.

TLFSM - Top level FSM that drives both the application and the environment, allowing the environment to change over time.

TSRV - Transport SRV. The TSRV attribute of a variable accumulates the set of SRVs that have flowed into that variable, independent of the lifetime of the current static environment. See also ASRV.
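To make the DOM relation and the PCheck concept concrete, the following minimal sketch expresses the dominates test and a simple flow check over secrecy classes. It is not taken from the dissertation; the class names, level values, and category labels are illustrative assumptions.

```python
# Minimal sketch of the DOM relation and a policy check (PCheck) over
# secrecy classes.  Names, levels, and categories are illustrative
# assumptions, not the dissertation's actual data structures.

from dataclasses import dataclass

LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top_secret": 3}

@dataclass(frozen=True)
class SecurityClass:
    level: str
    categories: frozenset = frozenset()

    def dominates(self, other: "SecurityClass") -> bool:
        # X dominates Y iff X.level >= Y.level and X.category-set is a
        # superset of Y.category-set.
        return (LEVELS[self.level] >= LEVELS[other.level]
                and self.categories >= other.categories)

def pcheck(source: SecurityClass, destination: SecurityClass) -> bool:
    # A flow from source to destination upholds a simple secrecy policy
    # only if the destination's class dominates the source's.
    return destination.dominates(source)

if __name__ == "__main__":
    doc  = SecurityClass("secret", frozenset({"legal"}))
    user = SecurityClass("top_secret", frozenset({"legal", "medical"}))
    print(pcheck(doc, user))   # True: the user's class dominates the document's
    print(pcheck(user, doc))   # False: a downward flow would violate the policy
```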


CHAPTER 1
INTRODUCTION

1.1 Overview of this Research

The number of distributed computer applications is increasing, along with the need for security in such applications. Our research began as an effort to design a security architecture for a distributed information retrieval system. This system was based on the Utah Retrieval System Architecture (URSA), a message-based, loosely coupled design, with processes expected to be distributed in a heterogeneous computing environment, ranging from small LANs to global information networks. We faced practical constraints common to most software developers, such as limited resources, a previously existing application design, conflicts between functionality and security, lack of control of the target environments, and variable end-user security needs. Thus, rather than trying to provide some fixed level of security, such as targeting one of the government-specified evaluation levels, the goal was simply to provide as much effective security as possible, targeted at the end-user security needs; essentially to see how much could be provided for how little. This would be a common goal for many small system developers facing the same practical constraints.

The desire was to drive the security architecture design directly from the security requirements, and to understand precisely what it did, and did not, address. However, there is a fundamental difficulty in moving from a set of security requirements to a design which enforces them, and in understanding what a particular design achieves. The difficulty is that each single security requirement actually embodies both a positive and negative requirement. The positive requirement is the simple functional need, while the negative one is the implicit requirement that nothing be able to subvert the positive requirement. This greatly complicates the normal design process. To achieve our goals, tools were needed that allow one to start with their security requirements, to systematically determine where security problems exist, to understand the net effects of security mechanisms as they are designed, and to readily explore alternative designs. One also must be able to build on the security architecture in the future, as the application or security needs change, and as development personnel change. This implies the need for clear records of these issues, especially what has and has not been addressed and why. Essentially this is no more than one would like in designing any software system. We discovered that current approaches to the design of secure systems simply do not meet these needs, especially for the small developer. Furthermore, very little

attention has been given to providing for these needs, or even to acknowledging their existence. We undertook a thorough investigation of how secure systems were being developed, and classified the approaches into a number of broad categories, which were then analyzed in terms of our needs. Although many different approaches achieved many different goals, the only one that addressed the security design process in a well-structured manner aligned with our goals was that of risk management. However, this proved nearly impossible to apply to the software design process of distributed, dynamic systems of the URSA class. It became clear that new design tools were needed if there was to be any hope of the small system developer approaching the security design problem in a well-structured manner, and of producing a design with a solid understanding of what has and has not been addressed.

As a result of extensive attempts to apply risk management and other approaches to the design problem, and of analyzing the difficulties and their underlying causes, we have developed a proposal for a new design methodology which addresses many of these needs. The framework of the methodology provides the same structured approach as risk management. However, the proposed methods and tools to achieve this goal are very different. The methodology consists of modeling all elements of the application and its operating environment in an object-based manner. The environmental elements require some novel abstractions, simple fault tree analysis, and heuristics. A high-level information flow analysis is then applied to this model through a simulation-based approach. This dynamically defines the assets and indicates where vulnerabilities exist. From this information, safeguards can be designed to address the vulnerabilities. These are also modeled as objects, but at three distinct levels of abstraction, allowing us to readily separate the security architecture from the application, to assess the effects and interrelationships of safeguards, and to readily explore alternative designs. Although the application modeling is specific to the target application, much of the support environment modeling can be done generically, tailorable to specific target environments by simply setting characteristic variables. Thus, small application developers can make use of environmental models that can evolve over long periods of time, representing the collective security expertise of many individuals, essentially becoming a security knowledge base that the developer can plug directly into his design process.

The following section discusses the significance of this research. The remaining sections are an overview of the dissertation, followed by a road map to the remaining chapters.

1.2 Significance of this Research

This research did not proceed by addressing an already well-understood problem. Both the problem and corresponding lack of adequate tools were discovered as a result of the original intended research, resulting in the need to propose a new design methodology. As such, the significance of this work encompasses three aspects.

1. The precise definition, explication, and characterization of the fundamental security design problem, its underlying causes, and tools to address it.

A necessary first step in solving a problem is to clearly understand it. The security design process is analyzed, identifying the fundamental difficulties, and exploring these in the context of the small developer with limited resources and other pragmatic constraints. This leads to a characterization of the needs, as well as the tools that would be desirable for a well-structured, well-documented security design process.

2. Classification and analysis of current security design approaches and their relationship to the above needs, including a detailed case study of our experience employing a risk management paradigm.

Although much is understood about many specific aspects of security, little discussion has been devoted to the actual design process in the literature, with no effort to attempt any such classification. From our classification and analysis, it is evident that very few structured approaches have been documented, and of these, the risk management paradigm appeared most directly applicable to the problem. A case history of our attempts to employ this and other related techniques is presented. Critical analysis of this experience provides a better understanding of the necessary tools and how they might be realized, and provides a foundation for the design methodology.

3. The description and analysis of our proposed security design methodology, which addresses many of our goals and provides a solid foundation for future research in this important, but neglected area.

A new design methodology is proposed, which addresses many of the needs discovered above, providing tools for a systematic approach to the design and maintenance of a security architecture for a distributed software application. Through analysis of numerous examples, its strengths, limitations, and directions for necessary future research are discussed. The methodology addresses the full range of issues involved in the security design process. This includes high-level security requirements, determining where security problems exist, understanding the impact of security mechanisms as they are designed, determining which problems have and have not been addressed, allowing one to readily explore alternative designs, providing the ability to build on the security architecture in the future, and incorporating security issues related to software, hardware, the physical environment, and personnel. Due to the extremely large scope of this undertaking, we could not focus in depth on the individual aspects. However, there was no alternative but to address the "big picture," since this is precisely what is lacking in the literature. For decades the security community has been undertaking detailed, focused research into specific areas of security, while largely ignoring the design process by which this can all be integrated in a structured manner. During the development of this methodology, all concepts were extensively evaluated in the context of the existing URSA system, which is complex enough to well represent this class of distributed systems. This included extensive walkthroughs of all concepts, modeling all of the fundamental URSA modules and functions, detailed exploration of all security issues that came to our attention, proof-of-concept programming where walkthroughs were not conclusive, and the implementation of numerous security mechanisms for the URSA system. All of this was essential to the creation of a viable methodology.

Given the scope of the problem and our limited resources, it was not possible to produce a full implementation of the necessary tools. However, it is believed that the groundwork has been clearly laid for this next step. All of the above contributions should provide a necessary solid foundation for future research in this important, but neglected area.

1.3 Need for Security in Distributed Systems

Many organizations are becoming increasingly aware of the benefits of networking with other organizations or systems, and more and more applications are being designed to run in a distributed environment. The need for security in such distributed computer applications is increasing for many reasons. The costs of security breaches can be considerable, both in monetary and human terms. For example, the estimated cost to the National Security Agency of changing a single compromised code word was a quarter million dollars [15]. Although that is an eccentric example, consider the personal damage and lawsuits that could arise from unauthorized release of medical records, or the legal judgment errors that could occur, or mistrials that could be declared, if legal information were found to have been compromised. Further, without adequate protection mechanisms, organizations are generally unwilling to share even nonprivate information, especially with the government, for fear of also releasing private information. This can result in tremendous duplication of effort and inefficiency [201]. Certainly, the growing interconnection of systems and the growing amount of information being committed to these systems can only increase the possibility of damage, especially from purely malicious attacks such as worms or viruses. Such attacks can undermine the entire cooperative foundation on which many efforts are built [191].

1.4 The Security Design Problem

In this section we discuss the fundamental difficulties in security design, the additional constraints faced by the application developer, and the resulting needs for security design tools. These issues are discussed in detail in Chapter 3.

1.4.1 Fundamental Difficulties in Security Design

A fundamental difficulty in designing a secure system lies in the initial task of moving from a set of security requirements to a design which enforces them. The reason is that each security requirement actually embodies what we have termed both a positive and negative requirement. For example, in the simple requirement that "all users must login," the positive requirement is that some login mechanism is required. The negative requirement is that there be no other way to enter the system, and further, that the login mechanism itself be protected from tampering. Not only must the design satisfy the obvious (positive) functional requirements, but it must also allow nothing else to occur which could negate those functions.

This greatly complicates the normal process of system architecture design, since enforcing the negative requirement can involve many other areas of the system far removed from those supporting the positive requirement. For example, whereas the above positive requirement could be met with a simple login program that accepts a name and password, protecting that login mechanism may require that a wide range of issues be addressed, such as encryption of network messages, employing a trusted kernel, enforcing specific access controls on the files involved, and enforcing controls on the life cycle of the software. Furthermore, the negative requirements will often necessitate a more complex, less intuitive design to support the positive requirement as well. For example, the simple name/password scheme may have to be changed to a challenge/response approach involving a remote authentication server. Consequently, not only is it difficult to design for the negative requirements, but it can be very difficult afterwards to understand the reasons behind a given design. What security purpose does each piece serve, and indeed, which pieces are even part of the security architecture? More importantly, it can be extremely difficult to determine the level of security attained. This is exacerbated by the following fact. If something is overlooked or done incorrectly in a general application design, the end users will generally complain. If something is overlooked in the security design, it is doubtful that an attacker will intentionally let us know.
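As a hypothetical illustration of this distinction (not URSA code; the names and credential store are invented), the positive requirement is easy to express in a few lines, while the negative requirement never appears in those lines at all:

```python
# Hypothetical sketch: the positive requirement ("all users must login") is
# easy to express in code; the negative requirement (no other way in, and the
# mechanism itself cannot be tampered with) is invisible here.

import hmac

# Illustrative credential store; a real system would not hold plaintext secrets.
_USERS = {"alice": "correct-horse-battery-staple"}

def login(name: str, password: str) -> bool:
    # Positive requirement: grant entry only on a matching name/password.
    expected = _USERS.get(name)
    return expected is not None and hmac.compare_digest(
        expected.encode(), password.encode())

# Negative requirement (NOT expressible in this function alone):
#  - no other code path may establish a session without calling login()
#  - the _USERS store, this module, and its delivery path must be tamper-proof
#  - transport of the password across the network must not leak it
```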

1.4.2 Concerns of the Application Developer

In many applications, the developer is faced with pragmatic concerns that further complicate the task of addressing the security issue. Necessary resources, such as personnel, development time, and expertise, may be very limited. Security may well play a secondary role to other attributes of the application, such as ease of use, performance, and maintenance. It may have to be added in an evolutionary manner, as best it can. End-user needs can also vary widely, from security requirements and physical environment, to customized applications. Lastly, the security architecture must be maintained as the application and end-user security requirements continue to evolve and as development personnel also change. These issues all add to the cost of security. Thus, the goal of the designer will probably not be to provide some ultimate level of security, nor even to conform to some government security guideline or standard. It will more likely be to simply provide as much effective security tailored to the end-user needs as is possible, given all the other considerations; to see how much security he can provide for how little cost.

1.4.3 Security Needs of the Application Developer

The combination of the application developer concerns with the general security design difficulties creates a major problem for software developers attempting to address the security design issue in a structured, predictable manner. They will need design tools that allow them to:

- Systematically discover where security problems exist, and why, at any point in the design process.
- Readily determine the effects of security mechanisms as they are designed.
- Explore the viability of alternative designs or the effects of altered requirements in a systematic manner.
- Maintain records so the security architecture can be later understood and readily adapted as changes occur in the application, or in the security requirements, especially as development personnel also change.
- Achieve these goals with limited resources.

Essentially, we are suggesting no more than one already expects in conventional software development: the ability to clearly see the required functions, to know which have and have not been addressed, to experiment with optional designs, and to document the work. It is the negative requirement aspect, and the resulting complexity that exists in a security architecture, that complicate these issues. However, for the application security developer, these capabilities are not luxuries. Without them, the design process simply cannot be approached in any systematic, predictable manner.

1.5 Current Lack of Design Tools

In developing a security architecture for the URSA system, we faced the same problems and needed the same tools as discussed above. Given the wealth of information available on many different aspects of computer security, such as research efforts in cryptography, network security, user authentication, and formal methods, as well as government guidelines and commercial products, we hoped to simply integrate this information into our needs. However, we discovered that the process of designing secure systems has not been well investigated at all, and that current methods of designing secure systems simply do not provide for the above goals. Essentially, there has been a general lack of attention given to the fundamental issue of how one generates the design of a secure computer system in a systematic, predictable manner. The focus has been more on the individual pieces (e.g., cryptography or authentication), or on after-the-fact analysis (e.g., verification), than on the process by which the system is designed. A similar situation would exist if general software development were described only in terms of the software tools to write programs, or the methods of their formal verification, rather than the design methodologies employed to create the programs in the first place. This is discussed in Chapter 4. In the end, the only structured approach to the design process is that of classical risk management. This attempts to provide structure through the systematic determination of assets, evaluation of security threats and vulnerabilities, and the design of safeguards to address these. This is discussed in Section 4.7 and Chapter 5. However, this does not apply well to detailed software design, especially in dynamic distributed systems where assets can rapidly change form and location. Even recent attempts to extend the risk management approach for software issues have severe problems. A detailed case study of our attempts to utilize these techniques in the URSA security design, and the resulting difficulties, is presented in Chapter 6.


1.6 A Proposed Solution

As a result of our efforts to employ existing design approaches, it became apparent that new design tools were required. The developer essentially needs tools to provide the same basic services that traditional risk management approaches attempt to provide, but which will work in this different environment. We propose a new design methodology that addresses many of the above goals, providing tools that allow the security architecture to be developed and maintained in a systematic, predictable manner. The framework of the methodology provides the same structured approach as risk management; assets are determined, security vulnerabilities located, and safeguards designed to address them. However, the methods and tools we use to realize this process are very different.

First, all elements of the application and its operating environment are modeled in a layered, object-based manner. Although mapping the application to these models is relatively mechanical with simple abstractions, the environmental element models require some novel abstractions and simple fault tree analysis in order to integrate with our needs. A high-level information flow analysis is then applied to this model. After considering a number of options, a simulation-based approach was chosen, employing heuristics to maintain tractability. This analysis automatically and dynamically defines the assets and indicates where vulnerabilities exist. This eliminates the need to manually list assets, trace operations, or enumerate vulnerabilities, and automatically adjusts to changes in the application or security architecture. From this information, safeguards can be designed to address the violations. These are also modeled as objects, but at three levels of abstraction, allowing us to readily separate the security architecture from the application, to assess the effects and interrelationships of safeguards, and to readily explore alternative designs. Of course, one must still design the security mechanisms, based on an understanding of general security design techniques, or know to employ previously existing designs, but the tools provide a clear indication of what needs to be addressed, how the mechanisms interrelate, and the effect of changes to the security or application architectures.

Note that although the application modeling is specific to the target application, the support environment modeling can be done generically, tailorable to specific target environments by simply setting characteristic variables. Thus, small application developers can make use of environmental models that can evolve over long periods of time, representing the collective security expertise of many individuals, essentially becoming a security knowledge base that developers can plug directly into their design process, regardless of their security background. Details of the methodology are discussed in Chapters 7, 8, and 9, with a complete summary in Chapter 10.
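The flavor of the flow analysis can be suggested with a small sketch. This is only an illustration under simplifying assumptions: the object names, the single flow primitive, and the toy policy below are invented, whereas the methodology itself models communicating finite state machines, environments, attacks, and safeguards in far more detail (Chapters 7-9). The idea shown is that each variable accumulates the security relevance values (SRVs) that have flowed into it, and a policy check runs on every flow.

```python
# Toy sketch of SRV propagation during a simulated information flow analysis.
# Object names, the "flow" primitive, and the policy are illustrative
# assumptions; the dissertation's models (LIFA/GIFA, AMODs, PMODs) are richer.

from dataclasses import dataclass, field

@dataclass
class Variable:
    name: str
    ssrv: set = field(default_factory=set)       # statically assigned SRVs (fundamental assets)
    asrv: set = field(default_factory=set)       # SRVs accumulated via flows
    clearance: set = field(default_factory=set)  # SRVs this variable may legitimately hold

    @property
    def srv(self) -> set:
        return self.ssrv | self.asrv

violations = []

def flow(src: Variable, dst: Variable) -> None:
    # Propagate the source's collective SRV into the destination's ASRV,
    # and record a policy violation if the destination is not cleared for it.
    dst.asrv |= src.srv
    leaked = dst.srv - dst.clearance
    if leaked:
        violations.append((src.name, dst.name, leaked))

# A fundamental asset, an application buffer, and an uncleared log file.
document = Variable("document", ssrv={"secret"}, clearance={"secret"})
buffer   = Variable("query_buffer", clearance={"secret"})
logfile  = Variable("debug_log")

flow(document, buffer)   # allowed: the buffer is cleared for "secret"
flow(buffer, logfile)    # violation: "secret" reaches an uncleared object

print(violations)        # [('query_buffer', 'debug_log', {'secret'})]
```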

1.7 Road Map to this Dissertation

1. Introduction

2. Background - Background on information retrieval systems and a brief overview of related work in security and distributed systems.

3. The Security Design Problem - Address first goal. The precise definition, explication, and characterization of the fundamental security design problem, its underlying causes, and tools to address it.

4. Current Approaches to Secure System Design - Begin second goal. Classification and analysis of current security design approaches and their relationship to the above needs. Summary at end.

5. Risk Management Techniques - Continue second goal. Describes the foundations of risk management and defines terminology. Necessary background for following chapter.

6. Attempts to Employ Risk Management Techniques - Finish second goal. A detailed case history of our efforts to employ risk management and related techniques in the design process, evaluating the strengths and weaknesses, and gaining insight into what is needed. Summary at end.

7. A Security Design Methodology - Begin third goal. The proposed security design methodology. Introduces the flow analysis approach and policy issues, and describes how all elements of the application and its environment are modeled.

8. Calculating the Information Flow Analysis - Continue third goal. Discusses the problems with realizing the necessary flow analysis in practice, and the details of the chosen simulation-based approach.

9. Designing Safeguards - Finish third goal. Describes the method of designing safeguards, modeling them, and including them in the analysis.

10. Summary of Design Methodology - Summarizes the entire design methodology. This may be read prior to the previous three chapters as an overview, if one accepts that much will be out of context.

11. Conclusions and Future Work

12. Appendices - Covering the URSA system, security policy models, security design principles, and specific issues for Chapter 6.

CHAPTER 2
BACKGROUND ISSUES

In this chapter, distributed information retrieval systems, including the URSA system and associated security needs, are discussed, since this research was undertaken in this context. A brief historical perspective on computer security is then presented, followed by a broader look at the wide range of issues related to this dissertation, with references to later sections containing more detailed discussions.

2.1 Information Retrieval Systems

2.1.1 What They Are

Information retrieval systems allow the storage of large amounts of full-text information, such as scientific journal articles, and the subsequent searching for documents that pertain to a desired area (see [175, 108] for an introduction to this topic). Unlike database management systems, the data in information retrieval systems are less well-structured. Rather than rigidly-defined fields, the text database consists of loosely-specified contexts, such as title or abstract. Queries generally consist of boolean combinations of words or phrases, with possible context and proximity constraints. For example, one may search for documents containing the phrase "nuclear power". If the number of returned documents appears too small, one may change the query to search for the words nuclear and power within the same sentence or paragraph. However, to avoid missing relevant material which does not employ that particular choice of words, one may expand nuclear to the logical union of (nuclear OR atomic OR fusion) and power to the union of (power OR energy OR plant). Searches generally proceed in an interactive manner, observing results and modifying query expressions, to focus in on the relevant documents.

A practical system may need to store a large amount of information. For example, the collected laws of Utah require approximately 20 megabytes (MB) (or at least 40 MB with court decision annotations). The text of all court decisions would take tens of gigabytes (GB). It is estimated that the retrieval systems operated by Mead Data Central, including legal, news, and patent information, contain over 100 GB [190]. The large database size, coupled with the need for a more complex search through less-structured data, complicates the implementation of an information retrieval system with good response time. Parallel searching of backend databases, and parallelism through distributed design, can help. Often, the physical distribution of databases and users mandates a distributed design in any case.
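For readers unfamiliar with such query languages, the short Python sketch below shows one way a query with term expansion and a same-sentence proximity constraint might be represented and evaluated. The representation is purely illustrative; it is not the URSA query language or its data structures.

    # Illustrative sketch only: a toy boolean query with term expansion and a
    # same-sentence proximity constraint, evaluated against pre-split sentences.

    NUCLEAR = {"nuclear", "atomic", "fusion"}     # expansion of "nuclear"
    POWER = {"power", "energy", "plant"}          # expansion of "power"

    def matches(document_sentences):
        """True if any one sentence contains a NUCLEAR term and a POWER term."""
        for sentence in document_sentences:
            words = {w.lower().strip(".,;") for w in sentence.split()}
            if words & NUCLEAR and words & POWER:
                return True
        return False

    doc = [
        "The new plant was controversial.",
        "Atomic energy remains a contested source of electric power.",
    ]
    print(matches(doc))   # True: the second sentence pairs "atomic" with "energy"

A real system would of course search inverted indexes with positional information rather than scanning raw text, but the boolean-with-proximity semantics is the same.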

2.1.2 The URSA System

The target of much of this research was an existing information retrieval system. The Utah Retrieval System Architecture (URSA(tm)) is the result of a research project in the Department of Computer Science of the University of Utah, begun in 1983, to develop an information retrieval system based on high-powered workstations and distributed computing [109]. The distributed approach separates the processing into the user interface, which handles the initial query parsing and the display of retrieved documents, and the database servers, which locate the desired documents based on the query and return them for formatting and display. The user workstations' window-management system gives a simple, yet powerful, user interface, while distributed processing provides a clean interface between the various system modules and allows its expansion as database sizes increase.

The URSA system is a message-based, large-grain, loosely coupled system, based on a client-server paradigm, and distributed at the process level. It was designed to run in a heterogeneous computer and network environment, with specialized search hardware, and has been ported to Apollo, HP, Sun, and VAX machines. It also supports dynamic system reconfiguration; its processes can be redistributed across multiple machines and networks, while the system is running and in use [210, 211].

This proved to be an excellent target since it was a real system designed for real users, and it was complex enough to represent a wide range of distributed system requirements. For example, it employed both RPC and asynchronous message passing, end-user machines were expected to range from simple PCs to powerful workstations, networks could range from LANs to WANs, and a typical system could easily involve over a dozen distributed cooperating modules. A brief overview of an URSA implementation and its basic functionality is given in Appendix A.
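As a rough illustration of the client-server, message-based style described above, the sketch below shows a query request and reply passed between a user-interface client and a database server process. The message fields and dispatch logic are invented for illustration; they do not reflect the actual URSA message formats or protocols.

    # Illustrative sketch of a message-based client/server query exchange.
    # Message fields and handler names are assumptions, not URSA's protocol.

    import queue

    server_inbox = queue.Queue()
    client_inbox = queue.Queue()

    TOY_DATABASE = ["nuclear power siting act", "utah water law", "atomic energy act"]

    def database_server():
        """Handle one request: search the (toy) database and reply by message."""
        request = server_inbox.get()
        hits = [doc for doc in TOY_DATABASE if request["query"] in doc]
        client_inbox.put({"id": request["id"], "type": "RESULT", "hits": hits})

    # Client side: send a request message, then pick up the reply.
    server_inbox.put({"id": 1, "type": "SEARCH", "query": "power"})
    database_server()                      # in URSA this runs as a separate process
    reply = client_inbox.get()
    print(reply["hits"])                   # ['nuclear power siting act']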

2.1.3 Security Needs in Information Retrieval

A distributed information retrieval system can be employed in a wide variety of end uses, with greatly varying security requirements. Three examples follow.

In the United States Patent and Trademark Office, anyone is allowed to access previously issued patents (this access is at the heart of the patent system), whereas applications must be held in confidence and can be accessed only by Patent Office examiners.

Medical information needs to be accessed by different users in different ways. Machine-readable copies of journal articles or other public information can be searched by anyone. Patient records identified by name or containing certain specific information must be accessed only by the physician or nurses currently involved with this patient, which may span more than one medical institution. Laboratory technicians need access to the information regarding the specific tests they are conducting. Pathologists or radiologists, perhaps from across the country, may need access to limited, or statistical, information on a large number of cases, in order to compare a specific case against past diagnoses, or for research purposes.

Lawyers need to access very large databases containing all the decisions of the courts of appeal and the Supreme Court, as well as the current laws and regulations. By its nature, this information is publicly known, and many users within a large geographic area may share the use of the same central database. However, lawyers may also require more restricted information. For instance, they may want to search all the past memoranda of their firm (to determine if representing a new client would pose a conflict of interest, or to see if a similar case had previously been handled) or access the evidence collected as part of the pre-trial discovery proceedings. Access to this type of information must be strictly controlled. Equally important is maintaining the privacy of the individual lawyers; one should not be able to arbitrarily observe the actions of another.

Some mechanism is required to control the simultaneous access to information by these different sets of users. A multilevel secure (MLS) system is one which contains information with different sensitivities, permits simultaneous access by users with different levels of clearance, and prevents users from accessing information for which they have no authorization. Note that "information" may include almost anything in the system, such as the database, administrative files, other users' queries, other users' actions, or the results of other users' queries.
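The clearance-versus-sensitivity comparison at the core of an MLS system can be stated very compactly. The sketch below uses the conventional lattice test (a clearance dominates a classification if its level is at least as high and its compartment set is a superset); the level names and compartments are invented examples, not those of any particular system discussed here.

    # Illustrative MLS access check: clearance must dominate the data's classification.
    # Level names and compartments are invented for the example.

    LEVELS = {"public": 0, "restricted": 1, "confidential": 2}

    def dominates(clearance, classification):
        """(level, compartments) dominates if level is >= and compartments cover."""
        c_level, c_comps = clearance
        d_level, d_comps = classification
        return LEVELS[c_level] >= LEVELS[d_level] and c_comps >= d_comps

    examiner = ("confidential", {"patent-applications"})
    public_user = ("public", set())

    issued_patent = ("public", set())
    pending_application = ("confidential", {"patent-applications"})

    print(dominates(examiner, pending_application))     # True: examiner may read it
    print(dominates(public_user, pending_application))  # False: denied
    print(dominates(public_user, issued_patent))        # True: issued patents are public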

2.2 Overview of Related Work

In this section, a very brief overview of related work is presented, primarily to provide perspective on the many different areas involved. Those areas of direct importance to this research are discussed in later sections. The general evolution of computer security is first discussed, followed by the wide range of areas relevant to this research.

2.2.1 Brief Historical Perspective on Computer Security

Physically secure systems have been designed for thousands of years, whether simply a fire at the mouth of a cave, a castle with a drawbridge, or a cliff dwelling accessible only by ladder [102]. Even the problem of providing security in computer systems is certainly not new in relation to the age of computers. The following quote could have been written yesterday, though it appeared over 20 years ago in 1972 [132, p. ii]: "Suddenly, everyone's concerned about computer security. It's the number one topic in technical EDP journals..."

In the 1960s, efforts were devoted to securing single-processor operating systems for time-sharing purposes [177]. Initially this was not so much to secure the data as to maintain the integrity of the system, as with memory protection. The increasing use of computers led to increased investigation in the 1970s [132], mostly driven by the military. Penetration testing techniques and tiger teams were providing a large knowledge base of known security problems, but were also demonstrating that systems were not very secure [200, 134, 12, 11], as did the well-publicized rash of computer crime in the '70s [13]. In response, the Air Force sponsored several studies to design and verify secure multilevel operating systems [200]. The methodology that grew was founded on the security kernel concept, which was based on the reference monitor model of security [2]. These operating systems, such as [183, 78, 167, 84, 93, 141, 27], were based on more formal approaches to design and verification. Conventional database systems were also addressed, as with integrity issues, or the inference and aggregation problems in statistical databases [60, 79].

Also in the 1970s, with the onset of networking, new efforts were launched toward secure communications [196, 111]. However, it was not until the 1980s that the integration of network and computer security research began to take place, driven mostly by the onset of interconnected, or distributed, operating system environments. This presents a whole new set of problems [43], and this work is still young. Only recently have there been serious attempts to deal with security in general applications (e.g., [10]), especially those that are distributed, and only very recently have others begun to express the need for better approaches to this problem (e.g., [59]). This appears to be the next logical step in this evolutionary process, as evidenced in the welcome note of the 1988 National Computer Security Conference (a landmark yearly event sponsored by the National Computer Security Center and the NIST) [40, p. i] (emphasis added): "Our challenge is to build upon the foundations we have established so that secure applications emerge. We must understand and record how we build on these foundations in order to secure user-based systems."

2.2.2 The Bigger Picture

Our research was strongly influenced by activities in many different areas, both within the security field and outside, including information retrieval, distributed systems, and data communications. We began research in large-scale distributed information retrieval systems in 1983. This resulted in the development of a number of prototype retrieval systems, culminating in the URSA production system currently in use. During this time, a substantial amount of research was undertaken in distributed systems and data communications issues related to this application. Refer to [109, 211] for more detail.

There is a large amount of information available on many aspects of system security. These range from current research efforts to commercial products, from ad hoc rules of thumb to government standards and guidelines, with targets ranging from isolated security mechanisms to certified secure operating systems and applications. The intended focus of the work ranges from computer science or mathematical research, to system operations and management. All of this work has some impact on the design of any secure system. Clearly, for our purposes, efforts which in some way dealt with a methodology for the design of secure systems were of primary importance. This includes formal methods, methods based on standards or guidelines, and risk management approaches, as well as generally employed ad hoc methods. These are discussed later in Chapter 4.

A secure system design will also build on a number of different security techniques, specific to a number of somewhat disjoint areas of study. Although these may not aid directly in the design methodology, some will play essential roles in any successful design. These include cryptographic techniques, operating system and network security techniques, user authentication mechanisms, database security efforts, formal security policy models, penetration testing, operation and management techniques, and various government standards and guidelines, as well as commercial products which help to achieve any of these goals. Although we cannot discuss all of these here, those that are related to design aspects are discussed in later sections. However, these are only the building blocks of a secure system, much like the algorithms in Knuth's books (e.g., [123, 124]) or in Software Tools [121] are the building blocks of programs. The issues we are concerned with are the how and why of putting them together.

There is also much to be learned from the general analysis of existing secure systems. A large body of generally accepted security design principles has emerged, based on the results of prior attempts at developing secure operating systems and applications. Those that we have found useful are collected in Appendix F.

CHAPTER 3
THE SECURITY DESIGN PROBLEM

In this chapter the first goal of this dissertation is addressed: the definition and characterization of the fundamental difficulties with security design, especially for the small application developer with practical constraints such as limited resources. We begin by analyzing the problems with designing security into any software system, especially those that are distributed, and defining the fundamental issues. This is followed by a discussion of the additional constraints faced by the small system developer and how these further complicate this already difficult task. From these analyses we specify the types of tools needed in order to undertake the security design process in the desired systematic, predictable manner. Since this entire issue has received virtually no attention, we felt it important to employ extensive examples and design scenarios throughout this chapter, to solidify the concepts. The reader wishing to quickly move on should read Sections 3.1, 3.2.4, and 3.3.

3.1 Fundamental Difficulties in Security Design

An initial problem in designing a security architecture is in mapping one's security requirements into a high-level design. As discussed earlier, whereas the goal may seem relatively straightforward, the security requirements usually have far-reaching implications due to the positive and negative requirements. For example, the simple requirement that "all file accesses be mediated" actually embodies two requirements. The obvious one is that some mechanism to mediate file access is required. This is the positive requirement. The less obvious and difficult one is that the system not allow anything else to negate that mechanism. That is, there must be no other way to access the files, and equally important, the file access mechanism itself must be protected from attack. This can involve many other areas of the system design, far outside of the access mediation mechanism. In this sense, it feels like one is working from a negative requirements specification.

This problem was succinctly stated at the 11th National Computer Security Conference [185, p. 17], where a "secure system" was defined as "One which does what you want it to do and never does what you do not want it to do." Similarly, in the Trusted Computer Security Evaluation Criteria [67] there are two types of testing requirements. Functional testing determines if the system meets its specifications, while security testing attempts to show that the system not only does what it should, but that it does nothing else [24]. The following examples attempt to clarify the positive/negative distinction, and the resulting issues.

3.1.1 Examples of the Difficulties

The following examples demonstrate the fundamental difficulties in mapping high-level security requirements into a security architecture. These examples involve 1) simple file access controls, 2) user authentication and protection issues, and 3) securing against Trojan horses.

3.1.1.1 Example 1: Simple File Access Controls

Assume that the goal is to build a simple file access control mechanism. The security requirements might be stated (in an overly simplified way) as: 1) all files must be labeled with a classification, and 2) all file accesses must be mediated by comparing this classification to the clearance of the user. A first approach may be to build a set of new file stream calls, in user space (NewOpen, NewRead, NewWrite, and NewClose), which handle these requirements, and which are implemented with the underlying operating system (OS) calls (open, read, write, close). However, there is nothing to stop a user from simply calling the underlying OS calls directly, to circumvent the controls in the new calls. Although the positive requirement was met, the negative one was not. The requirements specified that all file accesses must be mediated, not just some.

How can we attempt to mediate all file accesses? One way is to build the new call semantics completely into the normal OS calls, stopping the above circumvention. However, the negative requirement may still not be met. If the system supports mapped memory, the user may be able to simply map the file into his address space, since the mapping calls may be independent of the stream calls. If the system supports dynamic binding and shared libraries, the user may be able to simply replace the appropriate library file with one containing the original stream calls. Of course, if the filesystem media is removable, the user can simply mount it on his own system to access the files. In a distributed environment, additional possibilities exist. For example, the OS may allow different types of file access via the network, the protected system may be booted "diskless" off the intruder's own system, and the data accessed through his own operating system, or the intruder may simply be able to enter remotely as superuser.

Clearly there are a number of issues to consider. The key question is: by what predictable method could we discover all of these negative threats? The above issues touched on a very wide range of system issues. How can these be seen from the initial two simple requirements? However, an equally important question is: how can we assess the level of security achieved at any point in the design? How do we assess what has and has not been addressed in a given security architecture?
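The following sketch illustrates the positive half of this example: user-space wrapper calls that consult a label and mediate access before delegating to the underlying OS calls. The wrapper names follow the example above, but the label store, clearance lookup, and other details are invented for illustration. The sketch also makes the negative requirement visible: nothing prevents a caller from using the built-in open() directly and bypassing the check entirely.

    # Sketch of the user-space mediation layer from Example 1.  The label store,
    # clearance value, and NewRead variant are invented details for the example.

    CLASSIFICATION = {"/tmp/report.txt": 2}   # requirement 1: files carry labels
    USER_CLEARANCE = 1                        # clearance of the current user

    def NewOpen(path, mode="r"):
        """Requirement 2: mediate access by comparing label to clearance."""
        if CLASSIFICATION.get(path, 0) > USER_CLEARANCE:
            raise PermissionError("clearance too low for " + path)
        return open(path, mode)               # delegate to the underlying OS call

    def NewRead(f, size=-1):
        return f.read(size)

    # Create the sensitive file (for the demonstration only).
    with open("/tmp/report.txt", "w") as f:
        f.write("sensitive result\n")

    # The wrapper refuses access...
    try:
        NewOpen("/tmp/report.txt")
    except PermissionError as e:
        print("denied:", e)

    # ...but the negative requirement is unmet: the plain OS call still works.
    print(open("/tmp/report.txt").read())

Closing this bypass, and the others listed above (memory mapping, library substitution, removable media, network access), is exactly the part of the task that no amount of wrapper code can address by itself.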


3.1.1.2 Example 2: User Authentication and Protection

In another example, assume we have employed simple passwords for user authentication between nodes on an Ethernet. As the network expands, it is decided that the risk of passwords being stolen off the Ethernet is too great, so we change to a one-time-password generator. The remote system issues a unique challenge to the user, who enters it into the personal password generator. This generates a unique response which the user types in. Since challenges are never repeated, and challenge/response pairs are unique, network visibility of these authentication tokens is not a security problem. However, we later realize that someone could be impersonating the remote machine, issuing random challenges, and simply accepting any response. The user may be authenticating with an intruder! To remedy this, a two-way authentication mechanism is implemented, where both ends now know who the other is. However, it is possible that someone is in-line with the user's connection, recording a log of all their transactions. Thus, an end-to-end encryption mechanism is now built, to guard against this latest threat. However, what if a Trojan horse program resides on the user's node, or the remote system? The encryption may be worthless since the intruder program may act on the plaintext. The discovery of possible threats and remedies can go on and on. As in the previous example, how can we systematically discover all of these threats and assess the level of security achieved? How can we even know what the individual pieces of the security architecture are each doing?
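A minimal sketch of the challenge/response step is shown below. Here the response is a keyed hash of the challenge, computed with HMAC over a secret shared between the server and the user's password generator; this keying choice is an assumption made for the example and is not the specific device described above. The sketch also shows why one-way authentication is insufficient: an impostor server can issue a challenge and simply accept whatever comes back.

    # Sketch of one-time challenge/response authentication (keying is assumed).
    import hmac, hashlib, os

    SHARED_SECRET = b"device-secret"     # held by the server and the token device

    def generate_response(challenge):
        """What the user's password generator computes for a given challenge."""
        return hmac.new(SHARED_SECRET, challenge, hashlib.sha256).hexdigest()

    def honest_server_authenticates():
        challenge = os.urandom(16)                      # never repeated
        response = generate_response(challenge)         # typed in by the user
        expected = hmac.new(SHARED_SECRET, challenge, hashlib.sha256).hexdigest()
        return hmac.compare_digest(response, expected)

    def impostor_server_authenticates():
        challenge = os.urandom(16)       # an impostor can issue a challenge too...
        _response = generate_response(challenge)
        return True                      # ...and accept anything; the user learns nothing

    print(honest_server_authenticates())    # True, and a replay of it later is useless
    print(impostor_server_authenticates())  # True as well: hence two-way authentication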

3.1.1.3 Example 3: Securing Against Trojan Horses

Assume we wish to address the problem of unauthorized software modification on a remote user workstation, and that we trust our authorized users. That is, we are concerned with modification only by unauthorized personnel. The following security configuration may provide a reasonable degree of protection. (1) Allow a local disk on the user node, and (2) secure the software installation on that node. Thus, we know the software is correct initially. (3) Employ a trusted node/network interface unit that allows communication only with secure backend nodes. This prevents a remote network-based attack. (4) Finally, physically secure the node in a locked room, to which only trusted personnel have access. Thus, the only physical access to the node and disk is by trusted users.

Although this may work, these security controls are certainly guarding much more than the modification of software. They are dealing with physical room security, network security via special hardware, and assumptions of user trust and of the software installation process. Further, they all depend on each other to achieve the single security requirement, in this case, "no unauthorized modification of software." The same questions apply again: how to discover the threats, how to determine the level of security achieved, and how to understand what the security controls are each doing.


3.1.2 The Lack of a Design Methodology

The above negative situations are not normally encountered in mapping requirements into a design specification. Of course, in the design of any robust system, one must handle exceptional conditions. However, the interface with the external environment is usually through well-defined channels. For example, one can design a program to act on the error returns from each system call, or to handle any form of user input or garbled data communication messages. However, in the security arena, the adversary often determines the input domain. For example, one does not expect the system calls to return incorrect data without an error indication, or a user to single-step a program in the debugger and change the values of variables while it is running, or to simply use a personal program to access the data. Or, although one can design a communication system to recover from garbled messages, it will likely not work properly if the messages have been maliciously altered (e.g., keeping the checksum intact) or completely forged. Few systems can handle this type of nondeterminism, and those that do, such as fault-tolerant designs, are difficult to write. This is closely related to the problem of software safety as discussed in [130].

Certainly, as we have seen in the examples, each form of attack can probably be countered. That is not the issue. The issue is that in each case above we started off with a very simple requirement specifying a function the system had to provide. However, in order to design the mechanism to achieve that function, we ended up on tangents, most of which had no direct relation to the original requirement. Even for these very simple examples the security controls permeated a wide range of system aspects. This situation is only exacerbated in larger systems with more complex security requirements. By what method can we hope to see these other aspects in a predictable manner? The fundamental question is how to methodically find the problems in the first place. How do we create the universe of possibilities that the negative requirements must address (or, more realistically, a representative subset of that universe)?

A companion question is how the resulting security designs can be analyzed. It is difficult to know which threats have and have not been addressed when one cannot define the set of possible threats in the first place. How can one assess the effect of removing a particular security control, or whether the addition of new controls obviates older ones, or whether one set of controls addresses more threats than another? Thus, not only is the initial design process difficult, but it is equally difficult to determine the level of security achieved by any given design. What can we say about how well the original simple requirements have been met in any of the above examples? We cannot tell where we are or where we should be going. Although it may be possible to determine how well the positive requirements have been met, understanding how well the negative ones have been addressed is not as straightforward.

This is a very different situation from that encountered in many other types of software design. In conventional application design, one can often succeed by having a clear idea of the end goals and then following a reasonable development methodology, mapping requirements into necessary functional modules. The system is built, the users and testers will report problems, and as long as the architecture is well structured and the original design goals were on target, the problems can be readily fixed.

We developed the URSA system around a message-based architecture, following modular decomposition, and employing a layered design for support services. Bugs and missing features were reported and fixed, and the system quickly evolved into a reasonable product. This is generally not the case for security design. As we saw above, the mapping from initial requirements into a security architecture may not be obvious, and it can be difficult to see the correspondence even for the positive requirements. As for the negative requirements, general users will not exercise the security functions, only attackers will, and they are not likely to report any oversights. Further, although security elements can cause an application to break through over-control or malfunction, oversights in the security system will probably not do so. In fact, each oversight is generally one less place where the security mechanisms can break the application, and thus each one lessens the chance that it will be noticed.

For these reasons, penetration testing (Section 4.3) has commonly been employed to find security flaws. However, this takes additional resources which may not always be available, whether via formal means (e.g., trained tiger teams or certification efforts), or informal means (e.g., a large user base with sophisticated users, and possibly hackers to detect flaws, such as the Berkeley UNIX environment in universities). Regardless, this still does not help in the initial design. The "after the fact" analysis is only useful once we have a design to start with. Our concern is how to arrive at the design initially in a methodical way. We need to have a clear idea of where the problems exist in the first place, in order to actually drive the design process, as well as needing a clear indication of how well our design is achieving our security goals.

Of course, secure systems have been, and continue to be, successfully designed. Our understanding of many aspects of secure system design is growing, and each year more systems which address security issues appear. However, the process by which these secure designs are arrived at, the process of mapping one's security requirements into a security architecture in a systematic, predictable manner, is still not well understood. Although successful designs can be achieved without such a process, one would certainly not argue against the desirability of such a process. Further, as we will see later in this chapter, the ability to design in such a controlled manner is extremely important, if not mandatory, for many application developers.

3.2 Concerns of the Application Developer

In many applications the designer is faced with a number of pragmatic concerns that further complicate the task of addressing the security issue. Some of these are especially true for smaller companies with limited resources, but many can be true for any application. In addition to the developer's own concerns, the end users' requirements can be expected to vary substantially, both in terms of the physical environment and their security needs. The combination of these practical constraints with the general security design difficulties makes the developer's job much more difficult than one might expect from many theoretical or ideal treatments of the security design process.

In the following three sections we provide examples of these three concerns: the practical considerations of the developer, the end-user physical environment, and the end-user security needs. Following that, we present some example design scenarios which demonstrate how these issues can impact the security design process.

3.2.1 Practical Considerations

There are many practical considerations faced by the developer with limited resources and existing applications which must be secured. Some examples include:

- Necessary resources may be very scarce, such as personnel and development time, as well as expertise in the area of secure systems.
- In the case of previously existing applications, the designer will not have the luxury of redesigning it from the ground up to build security in as it should be, but will likely be forced to add security in an evolutionary manner.
- Even though security may be important, it may not be the primary objective. Other attributes of the application may well take priority, such as ease of use, performance, and maintenance, which can be adversely affected by security measures. Security may have to be "fit in" as best it can, minimizing these costs.
- The security architecture must be able to readily accommodate changes; products must not be expected to stagnate just because security is added.
- The ability to customize applications for different end-user needs may be a primary selling factor for the product. The security architecture may have to accommodate such differences.
- Personnel will come and go, and clear design documentation is needed to aid in the maintenance, enhancement, or modification as the application changes over time, especially if security is already a low-priority aspect of the system design.



3.2.2 End-User Physical Environment

The end-user on-site physical environment has a major impact on the security requirements, and is often out of the developer's control. Some examples include:

- The developer may rarely have the luxury of controlling the target environment, and will have to work the design into it. For example, a network may already be in place, and not physically secured, or of a lesser desired type or configuration. The entire network may initially be controlled by the end user, but later it may need to be extended to other departments. Or, the application may have to be built on a heterogeneous platform out of the developer's control. For example, the customer may already own certain workstations or personal computers, or the cost of a particular machine may drop, and the developer is forced to change.
- Some sites may be able to provide users with workstations or personal computers that provide hardware support for memory protection (e.g., supervisor/user states), while others may have to live with wide-open operating systems, such as exist on many personal computers. Some may even be able to employ verified secure kernel-based operating systems, such as Gemini [181]. Of course, any one environment may have any combination of these.
- Some environments may be limited to a single local area network, isolated from the outside, whereas others may exist within larger networks, which may span the globe, in which case misuse from outside is a very real issue.
- Physical security may also vary considerably, even within a given organization. Can server nodes be isolated in a secure room? How easy is it to replace or modify a given node, or to add another one to the network? How well are dial-up lines secured?
- Physical security may change over time. A site with a single LAN limited to one department may later expand the LAN to other departments. How does this impact the security architecture?

3.2.3 End-User Security Needs

Specific security needs of end users can be quite variable. Security is not a black-and-white issue. Some examples include:

- An on-line library service may only worry about protecting the integrity of the information, with little concern about who reads what.
- A medical information system would have both integrity and secrecy requirements, with the need for multiple, overlapping security partitions. This would be substantially more complex than the library requirements.
- Although one organization may be comfortable trusting the integrity of the application programs, another may insist on guarding against the possibility of subverted programs or data (e.g., Trojan horses or viruses).
- Some organizations may trust their employees to not maliciously attack the operating system, and in these cases, the system need only protect against accidents and external attack. Others may have no trust in the system users, and may thus require protection from determined internal attack.
- Some organizations may not care what a user does with information once it is obtained (e.g., the user may decide to e-mail it to a friend). Others may want to control such information until the time it actually leaves the system, for example, if it is sent to a printer or floppy disk drive. They may also want to post guards to control what is physically removed from the printer.


3.2.4 Some Security Design Scenarios

Following are some example design scenarios that demonstrate how these issues can impact the actual security design process. In these examples we are not even considering the fundamental problem of how one generates the initial security design, of how one systematically maps the security requirements into a security architecture. That problem has already been demonstrated earlier in this chapter. Here, we are focusing solely on the consequences of the other issues arising in the application life cycle. It should be apparent that without some form of structured design process, these kinds of situations will be extremely difficult to handle.

- The interdependencies of security mechanisms must be dynamically visible during the design process, since changes in one security mechanism may impact others. A communication link between a user node and backend server may have been physically secured to prevent password visibility. The later adoption of a challenge/response authentication scheme may seem to render the physical network security unnecessary. However, the physical security may have also been specified for data transfer protection, in which case it will still be needed. The later adoption of message encryption hardware may finally obviate the need for physical security. The designer must be able to readily see these types of dependencies (a small sketch of this bookkeeping follows this list).
- The designer may only be able to partially address a particular security problem, regardless of cost, due to technological limitations. However, the technological limitations may vanish in a couple of years, but unless adequate records were kept of the design decision, how will this knowledge be applied? How would one even know it was an issue?
- A new operation may be added to the application, perhaps only for a particular customer. The designer needs to be able to readily see if the current security architecture is still valid, or if new holes have been opened. If so, tools are needed to clearly show what they are, and aid in the necessary design modifications. The designer must also be able to see the global security ramifications of the modifications, to know that other aspects of the security architecture are not being inadvertently compromised.
- Any time the application or its security architecture is customized for a particular target, it would be useful to have tools to determine if this new design will still be secure in the other, previous end-user environments. If so, a complete retrofit may reduce the costs of maintaining additional system versions.
- A previously designed security architecture may have required that a particular network link be physically secured against tampering, since all previous end-user sites have been able to accommodate this. However, a new customer may be unable to comply with this requirement. The designer must be able to determine the impact of this on system security, and possibly undertake design modifications to address it. If a retrofit is possible, this may obviate the original secure network requirement.



- Software maintenance is often a major cost in any application [171], and will only increase if the security architecture is not clearly understandable. For example, a new design engineer may notice an "obvious" security issue that appears to have been overlooked. It must be possible to determine if it had been ignored by choice (e.g., was not worth protecting), if it had been addressed in a way the engineer simply did not see, or if it truly had been overlooked.
- An end user who has previously been limited to a single local network with no external connections or even dial-up lines may later decide to connect to a public data network. The designer must now determine if this has any impact on the system security, and redesign if required.
- A new end user may decide that some existing threat, which had previously been acceptable to live with, or which had only been audited (due to the expense of stopping it entirely), must now be fully countered. However, without adequate records, the end user may not have even known it was there, let alone that it had not been addressed.
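As promised in the first scenario above, here is a minimal sketch of the kind of bookkeeping such tools would automate: each safeguard is recorded together with every threat it was adopted to counter, so that removing one justification does not silently remove a control that something else still depends on. The safeguard and threat names are invented for the example.

    # Illustrative safeguard-to-threat bookkeeping; names are invented examples.

    safeguards = {
        "physically-secured-link": {"password-visibility", "data-transfer-exposure"},
        "challenge-response-auth": {"password-visibility"},
    }

    def still_needed(safeguard):
        """A safeguard may be dropped only if every threat it counters is
        covered by some other safeguard."""
        others = set().union(*(t for name, t in safeguards.items() if name != safeguard))
        return not safeguards[safeguard] <= others

    # Challenge/response now covers password visibility, so is the physical
    # security of the link obsolete?  No: it still covers data transfer exposure.
    print(still_needed("physically-secured-link"))   # True

    # Add link encryption hardware, which covers data transfers as well.
    safeguards["link-encryption"] = {"data-transfer-exposure"}
    print(still_needed("physically-secured-link"))   # False: may now be retired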



We previously saw the general need for a systematic, predictable approach to security design. Because of the above considerations, the designer in these situations is even more in need of such a mechanism. Tools are needed to help in designing the security architecture, based on the security requirements, the application architecture, and the existing security architecture (if any), at any point in time. Since these can all change, the designer needs to be able to easily reassess the situation and alter the security architecture as required.

3.3 Design Tools for the Application Developer

Although the above considerations can make the job of designing a security architecture more difficult, they do result in one major counterbalance. For most applications, the goal will not be to provide some form of "ultimate" or verifiable security; it will not be to submit to the National Computer Security Center for evaluation; it will not even be to provide security that meets some government or industry guideline. In many cases, the goal will be to simply provide as much effective security, tailored to the end-user needs, as is reasonable, given all the other considerations; to see how much security one can provide for how little cost. Further, security mechanisms are notorious for reducing system efficiency, by slowing it down, making it cumbersome to use, or eliminating needed functionality. There is no desire to design more into a security architecture than is required. Based on the information presented in this chapter, we propose the following list of requirements for tools to achieve our goals:

1. The designer needs to be able to systematically discover where the security problems exist, and where security controls need to be enforced.

2. The designer needs to be able to readily determine the effects of security controls, as they are designed, and to clearly see which problems have and have not been addressed at any point in the design process.

3. Any software design process requires feedback at different stages. This is especially true in this situation. The security design process will involve decisions on alternatives and tradeoffs, or iterations with altered requirements, especially if cost effectiveness is critical. Even if the security requirements are very precise and stable, there may be many options or different constraints in designing systems to enforce them. The developer must be able to play what-if games with optional security designs, to easily "back out" bad ideas, and to see the global effects of the addition or removal of security mechanisms.

4. The designer must be able to keep records, so the security architecture can be extended, or modified, as the application or the security needs change, especially as development personnel also change. Without a clear understanding of why design decisions were made, of what has and has not been addressed, and of how the various security mechanisms interrelate, it will be very difficult to systematically build on this in the future.

5. Lastly, the designer needs to accomplish all of this with limited resources.

Essentially, we are suggesting no more than one already expects in conventional software development: the ability to clearly see the required functions, to know which have and have not been addressed, to experiment with optional designs, and to document the work. It is the negative design aspect and the resulting interdependencies which exist in a security architecture which have complicated these issues here. It should be clear that we are not suggesting the need for such security design tools and aids simply from theoretical interest or as a luxury. We are saying that without such tools, the system developer in the above situations simply cannot undertake the design process in a systematic and predictable manner. Some form of secure system may well be produced, but it will be very difficult to determine the level of security achieved, and very difficult to build on in the future.

CHAPTER 4
APPROACHES TO SECURE SYSTEM DESIGN

This chapter is the first step in addressing the second goal of this research, discussing current approaches to the design of secure systems and how these relate to the desired tools presented in the previous chapter. To understand current approaches, we first had to create a representative categorization, which to our knowledge has not been done before. This is presented along with a detailed discussion of each approach and how it relates to our needs. These approaches are explored in depth to provide a solid understanding, since they have not been previously discussed in the literature in this context. This is followed by a discussion of recent work supporting our contention that adequate design tools are lacking. The reader wishing to quickly move on should read Sections 4.1 and 4.2, followed by the summary of this chapter in Section 4.12.

4.1 A Lack of Existing Design Tools

There is a wealth of information available on many different aspects of computer security in the forms of research efforts, commercial products, government guidelines and standards, and ad hoc design experience. Each of these applies to one or more phases in the design, development, testing, and maintenance of any system. It appeared initially that we would simply have to integrate the relevant approaches to achieve the goals of Chapter 3. However, integrating these for our needs was very difficult. It eventually became clear that very few applicable tools, or even research, existed. It also became clear that there has been a general lack of attention paid to the fundamental issue of how a developer with the previously mentioned needs can actually generate the design of a secure application in a systematic and predictable manner, not to mention maintaining it. Only recently have these needs even been mentioned in the literature. In most cases, an after-the-fact analysis of the security architecture takes precedence over the method by which the architecture was arrived at in the first place. It was very rare to find an article that described the actual security design process for a software system.


4.2 A Categorization of Current Design Efforts

As we found ourselves searching for design tools, we of course needed to fully understand how others were designing secure systems. However, given the lack of attention to the design process in the literature, there has been no effort to our knowledge to categorize or classify current secure system design approaches. Thus, one of the first tasks was to produce such a categorization. We undertook a detailed investigation of how secure systems of all types were being designed, with literature ranging from management-level risk analysis to formal verification efforts, in an attempt to distill the commonly employed design methods. Given the lack of documented efforts targeting this issue, this often required reading between the lines in the literature, and engaging in discussions with developers and researchers, in an attempt to understand how secure systems were truly being developed. Analyzing what was observed, we were able to group the various approaches to the problem into a set of broad categories which could then be analyzed somewhat independently. Most systems will likely be designed using some combination of these approaches. The categories are first briefly described below, followed by a detailed section on each one.

Note that this categorization represents only methodologies for designing secure systems, and does not cover the full range of related security research. That wide range of topics was discussed in Section 2.2, with additional detail on selected areas provided throughout this dissertation as necessary (e.g., Appendix B on security policy models, or Appendix F on security design tenets). Much of the difficulty applying any of the techniques discussed in this chapter is due to the complex interaction of the security controls and the system design, coupled with the complexity of the target class of system and its dynamic, distributed nature. In later chapters we discuss a case history of our attempts to employ some existing tools to the design process, and our proposed design methodology. The severe limitations of current approaches will become more obvious within those contexts.

- Find and Fix. As the name implies, security flaws are patched as they are found, analogous to "code and fix" software design. This implies that one can somehow find the flaws.
- Preventive Design. This consists of applying available security techniques, but not necessarily in response to detected flaws. This is similar to find and fix, but is not driven by flaw detection. It is driven only by availability of security techniques, in the hope that putting them in will help.
- Formal Methods. These consist of developing a formal security policy model, specifying the system design at various levels, and proving the correspondence between the model and the specifications. This is similar to other formal software development approaches.
- Guideline Adherence. This consists of designing the system to adhere to some security guideline, policy model, reference model, or standard.
- Risk Management. This is the classical security approach, generally consisting of determining assets, assessing threats and exposures against them, and designing safeguards to counteract the exposures.
- Safe and Reliable Systems Design. There are both similarities and differences between the design of secure systems and these related efforts, which we explore here.
- Conventional Software Development. We already discussed how the negative requirements specification renders conventional approaches alone inadequate for the security design process. However, we are still designing a software system, and will thus make some use of conventional techniques.
- Experience. The system design follows a previous design or the experience of the designers. This is the sole approach in some systems, but also plays an obvious role in any design approach.

4.3 Find and Fix Approach

Probably the most common approach to achieving security in existing application systems is to simply patch security problems as they are discovered. This approach, analogous to "code and fix" software design, is popular with both end users and system developers. From the end-user perspective, many application security problems are due to inherent design flaws which they have little or no ability to change. Patches are simply a last resort. This is especially true with operating systems. From the developer's perspective, they may lack sufficient resources, such as expertise, time, or money, to mount a full-scale security design effort, but still may be able to patch known problems.

This approach can actually be very successful if a large experience base exists, with enough connectivity such that over the years the known holes and patches permeate the community. A good example of this is the Berkeley UNIX operating system in the university environment. General sources of information include personal communication with other users, network newsgroups, security mailing lists, user groups, and publications such as [207, 63, 22]. Regardless of particular needs, these sources can provide useful insight into general security problems and workable solutions.

Penetration testing, the most common technique for evaluating secure systems in the '60s and '70s, also falls into this category. Well-trained tiger teams would attack a system in a systematic manner to detect flaws [12, 11, 204]. Systems Development Corporation refined this process into the well-known flaw hypothesis methodology [134, 105]. The Department of Defense Trusted Computer Security Evaluation Criteria (TCSEC) [67] requires exactly this type of testing for certification. Penetration testing is still a very useful technique in detecting flaws. The results of such tests also help to point out common system problems and attack scenarios that are useful to others, as in [134, 12]. A recent penetration analysis method and tool that employs formal techniques is described in [98], but this really falls under the heading of formal methods.

However, these approaches offer little help during the initial design phases, providing no method or structure for evaluating the solutions, understanding the level of security obtained, exploring alternatives, or building on the architecture in the future. The goal is usually to detect flaws in an already secure system, rather than to serve as an initial design tool. This is rather contrary to our goals. However, once a system is designed, feedback on security flaws from the end users is very important (an inexpensive form of penetration testing).

4.4 Preventive Design Approach

The find-and-fix approach does not apply well in situations where access to information on security flaws is limited. For example, end users may have no access to network news or mail, or a system may not have a large enough user base for articles or books to be written about it, or even to support an informal user group. In these cases, an alternative may be to simply apply available security technology in a preventive manner. This is similar to find-and-fix, but is not driven by the flaws. Of course, one can also find-and-fix as required later.

The preventive design approach may even be employed if flaw information is available, simply because it requires less effort. One can immediately install those mechanisms that are readily available, affordable, or currently well-understood, and can upgrade continuously as required, without a lengthy process of flaw determination. As with find-and-fix, this may be all one can afford to do, and may be better than nothing. For example, users installing a new operating system on a personal workstation may elect to simply follow the instructions in the manuals or installation notes on security (e.g., [63, 9]). They may have no idea if this provides adequate, if any, protection for their particular needs, but the alternative may be no security at all. Similarly, application developers with very limited resources may elect to build on the security mechanisms of the underlying operating system, or to install currently available technology such as one-time password authenticators, network encryption devices, and software file encryption. Sources of such information include the comprehensive security references discussed later in Section 4.6.1, and current product literature and trade journals.

They may, however, have little justification for what these efforts are actually securing against. This can be very loosely compared to randomly throwing mortar at a leaking dam. In the end, the leaks may be gone, but one has no idea of where they were, why they were there, nor how well any of them have been patched. This suffers from the same limitations as the find and fix approach, with an additional drawback: building on the security architecture in the future will be even more difficult since there will likely be little architectural integrity to start with. This can also result in a false sense of security, which may be worse than no security at all (e.g., who would want to live downstream of that dam?). A good example of this was provided in [125, p. 1], "A Survey of Data Insecurity Packages." The abstract says it all: "Five commercially available encryption packages for the IBM PC are described and broken." Entrusting sensitive data to nonsecure mechanisms may be far worse than assuming no security at all, and then securing it by other means (e.g., by placing floppies in a safe).


4.5 Formal Methods

Due to the difficulty of determining the level of security achieved in a given system design, and the limitations of penetration testing, research was begun in the early '70s into formal methods of evaluating secure systems. A large body of information has emerged, which generally falls into two categories: formal policy modeling, and formal system specification and verification. The concept is similar to any formal software development effort, with the goal of verifying the correspondence between system specifications and the security requirements. One first develops a formal mathematical policy model that accurately reflects the high-level security requirements of the system. Then, a high-level, formal design specification of the system is developed, and proven to correspond with the requirements of that model. Additional levels of more detailed specification can also be generated, and proven to correspond with the higher levels. Finally, correspondence of the actual code with the lowest-level specification is demonstrated, or proven where possible. In the next section, we discuss the specification and verification issue, reserving the discussion of policy modeling for the following section.
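In miniature, the structure being verified looks like the sketch below: the policy model contributes an invariant that must hold in every reachable state, the design specification contributes the transition functions, and verification amounts to showing that every transition preserves the invariant. The two-level example and its read-access rule are invented to illustrate the shape of the argument, not taken from any of the systems cited in this chapter.

    # Toy state-machine view of specification vs. policy model (invented example).

    # Policy model: a state is secure if no access pairs a low clearance with a
    # higher-level object.
    def invariant(state):
        return all(clearance >= level
                   for (clearance, level) in state["accesses"])

    # Design specification: the only transition that grants access checks clearance.
    def grant_read(state, clearance, level):
        if clearance >= level:                         # the rule under verification
            state["accesses"].add((clearance, level))
        return state

    # "Verification" here is exhaustive checking of the tiny state space.
    state = {"accesses": set()}
    for clearance in (0, 1):
        for level in (0, 1):
            state = grant_read(state, clearance, level)
            assert invariant(state), "transition broke the security invariant"
    print("all transitions preserve the invariant")

A real verification effort reasons symbolically over all states and transition sequences rather than by enumeration, which is one source of the cost discussed below.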

4.5.1 Specification and Verification

Two of the earliest efforts to produce a formally verifiable system were the Kernelized Secure Operating System (KSOS) [141, 27] and the Provably Secure Operating System (PSOS) [78]. Many other efforts have since taken place, such as in operating systems, database systems, and networks. Examples include UCLA Secure UNIX [199], the Honeywell Secure Communications Processor (SCOMP) [23, 84] (now certified at TCSEC Class A1 [67, 158]), the SeaView multilevel secure relational database system targeted at TCSEC Class A1 [203, 62, 61], the VERDIX Multi-Level Secure LAN [143], and [37]. Good overview discussions of the existing formal specification and verification techniques are given in [87, 48, 45], which also point to some of the applications in which they have been employed. The more prominent systems over the last 10 years are the Gypsy Verification Environment (GVE), the Formal Development Methodology (FDM), the Hierarchical Development Methodology (HDM), and AFFIRM (references to these can be found in the above citations).

These specification techniques are actually methodologies, since in addition to the specification languages and tools (e.g., theorem provers, information flow analysis techniques, and code correspondence methods), they also recommend a design approach, such as viewing the system as an abstract state machine, or as a hierarchy of nested procedure calls (e.g., inputs, outputs, function) [87]. However, although this can provide a useful structure during the design process, it does not help at all in the mapping of security requirements into the initial design. Formal methods can also help discover mistakes during the design process, if one incrementally verifies each new piece of the design as it evolves. This was pointed out in [203], where each error that was discovered provided more insight into the problem. Of course, they are also directly applicable to final evaluation or certification. Again, although this can be very useful, we are still left with the problem of methodically and predictably developing the design initially. We were looking for tools to show where security elements are needed in the first place, not just an after-the-fact analysis (as these tend to be [120]).

In any case, these efforts have met with only limited success (this is certainly not an off-the-shelf technology [1]), and have proven to be extremely time and resource intensive [117]. Clear examples of the latter are the time frames of the earlier UCLA secure UNIX kernel development [199], the LOCK TCB project at the National Computer Security Center [178, 180], the SeaView project at SRI International [203, 61], and the detailed analysis of the security testing of XENIX [55]. The complexity was demonstrated in a very simple way by a public request for formal design and verification of a very simple RS-232 software repeater [127] and the responses [128]. Such resources and expertise are simply not available to most small system developers. Certainly, if we believe there are tools to aid in the security design process, then they may well be based on some type of simplified (tractable) formal mathematical analysis. However, they would have to do more than just verify a given design specification. They would have to aid in the initial construction of that design. This is what we were searching for.
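To make the flavor of these methodologies concrete, the following toy sketch is our own illustration (it is not drawn from Gypsy, FDM, HDM, or AFFIRM, and all names in it are hypothetical). It shows the kind of check such environments automate: the system is viewed as an abstract state machine, and each transition function is checked against a stated security invariant. Note that the check exposes a design error, the logout transition failing to revoke open handles, rather than merely certifying a finished design.

    # Toy illustration of state-machine specification and invariant checking.
    # All names are hypothetical; real tools prove preservation symbolically,
    # not by running a single scenario.
    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class State:
        authenticated: frozenset = frozenset()    # logged-in users
        open_handles: frozenset = frozenset()     # (user, file) pairs

    def invariant(s: State) -> bool:
        """Only authenticated users may hold open file handles."""
        return all(user in s.authenticated for (user, _) in s.open_handles)

    def login(s, user):
        return replace(s, authenticated=s.authenticated | {user})

    def open_file(s, user, f):
        if user not in s.authenticated:
            return s                              # request refused, state unchanged
        return replace(s, open_handles=s.open_handles | {(user, f)})

    def logout(s, user):
        # Design error: open handles are not revoked on logout.
        return replace(s, authenticated=s.authenticated - {user})

    s = login(State(), "alice")
    s = open_file(s, "alice", "catalog.db")
    print(invariant(s))                           # True
    s = logout(s, "alice")
    print(invariant(s))                           # False: the flaw is exposed

A real environment would prove that every transition preserves the invariant for all states, and would then carry the proof down through more detailed specification levels; the structure (model, invariant, transitions, correspondence) is the same.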

4.5.2 Formal Policy Models

Although formal specification and verification efforts are not directly applicable to our goals, the need for a clearly defined security policy is mandatory in any design effort. A given computer system is only secure with respect to its enforcement of some policy [67, 46], which is defined external to the system. Simply put, in order to prevent unauthorized release and modification of data, one must clearly define what is meant by "authorized." The high-level meaning of security can vary greatly, from the simple integrity requirements of a library, or the secrecy requirements of the military multilevel model [67], to the complex secrecy and integrity requirements of an integrated medical information system, or of a database system for a financial market analyst who must not use insider knowledge in advising corporate clients [36]. Because of the importance of policy models, a historical perspective on modeling efforts is presented in Appendix B, giving an overview of the issues involved and the types of models which have been developed to address various needs. Our specific policy needs are discussed in Section 7.1.3.2. In this section, we are concerned only with the role of policy modeling in the actual design process.

The primary purpose of a security policy model is to express the high-level security requirements in the most simple, unambiguous, and general way possible [87], suppressing irrelevant detail in favor of abstraction for analysis [20]. It must accurately capture the meaning of security, since this is what the target system must be evaluated against. For this reason, and since most policy modeling efforts have been targeted at formal verification, the model must contain enough detail and be sufficiently related to the system design to allow for correspondence proofs or demonstrations. That is, it must provide the glue between the high-level requirements and the actual system specification. As such, policy models generally specify not only the high-level security requirements, but also encompass many aspects of the system security architecture.

Loosely speaking (as pointed out much more rigorously in [92]), one may view the model as representing the structure of the target system, whereas the policy aspects are just assertions that must be true about that model in order to achieve the desired security. The combination of these two aspects is to specify a system that achieves the high-level security requirements. Some examples may clarify this. In the Bell and LaPadula model [193, 21], the high-level requirements are those of the military security policy [67], the target system is modeled as a state machine, and the policy assertions include state invariants and constraints on allowable transition functions. In the SeaView MLS database model [203], the high-level requirements are much more complex, dealing with multilevel security in a relational database, the system is again modeled as a state machine, and the policy assertions include constraints on allowable transition sequences. In noninterference models such as [114, 115] or [92], the system is modeled as communicating state machines and policy assertions specify required properties of global traces (e.g., that certain traces not interfere with other traces).

Since many models encompass aspects of the actual system architecture, they are essentially representations of an assumed high-level system design. Models are generally utilized to evaluate an existing design specification rather than to guide the design process itself. Certainly, such evaluation is important, as with the incremental verification of components as the design progresses [203]. But this is all after the fact. We need help in the initial design process. We are starting with no idea of what the security architecture should consist of. Furthermore, models often represent only a very limited view of the design. For example, they may clearly specify the required security properties and the security-related behavior of the necessary security functions, but offer little or no help in understanding how to provide the required functions in light of the negative requirements. It is just assumed that the mechanisms are protected (e.g., that they will exist as part of some trusted computing base) and that they are not circumventable. However, if a trusted computing base is required, then we have to design this, with a target environment containing many variables such as secure versus nonsecure nodes and networks, trusted and untrusted users, and variable application needs. This is where we want help. The architectural requirements specified in the policy models offer no help in this respect at all. Furthermore, many important security requirements are usually not stated in the policy models at all. These include requirements for authentication, auditing, or assurance. They may be specified elsewhere, as in the TCSEC, but even there, they are merely requirements. There is no aid in designing the mechanisms to enforce these requirements.

It was tempting to try to apply a particular existing model and associated high-level design, and build our architecture on that. However, this was not a reasonable approach for two reasons. Firstly, one of our primary goals was to be able to explore alternative designs in developing the security architecture, and understand the reasoning behind the choices. Simply employing an existing design contradicted this goal. Secondly, how would we ascertain that such a model would provide for our needs in the first place? This is the basic problem we are trying to solve. We wanted tools to help us understand what was needed and why. This issue is discussed from a slightly different perspective in Section 4.6. Certainly, once a preliminary design is produced, one may discover that it is similar to that described by a particular model, or one could attempt to produce one's own policy model for this design. This would be useful in anticipation of a more formal approach to certification, verification, or even simply evaluation of the system. However, the design must come first, and the existing models offer little help in that process.

In conclusion, policy modeling efforts, although important in evaluating a secure system design, offer little aid in the system design process. They are analogous to a high-level functional specification: they describe the goal, but not how to get there. However, much can be learned from understanding existing modeling efforts, in terms of different types of high-level security requirements, ways of expressing those requirements in terms of a system model and policy assertions, and in the benefits and limitations of different forms of models (e.g., state machine or information flow).
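As a concrete illustration of the state-machine style of model discussed above, the following sketch (a deliberate simplification of our own, not the actual Bell and LaPadula formulation; the levels, subjects, and objects are invented) expresses the two best-known assertions of that model, the simple security property (no read up) and the *-property (no write down), as checks over a modeled state. What matters here is what the model does not say: nothing indicates how the checking mechanism itself is protected, where it sits in a distributed system, or how it is kept from being bypassed.

    # Simplified sketch of multilevel policy assertions in the Bell-LaPadula style.
    # Levels, subjects, and objects are invented for illustration.
    LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2}

    subject_clearance = {"clerk": "unclassified", "analyst": "secret"}
    object_class      = {"bulletin": "unclassified", "source_report": "secret"}

    def dominates(a: str, b: str) -> bool:
        return LEVELS[a] >= LEVELS[b]

    def may_read(subject: str, obj: str) -> bool:
        # Simple security property: no read up.
        return dominates(subject_clearance[subject], object_class[obj])

    def may_write(subject: str, obj: str) -> bool:
        # *-property: no write down (information must not flow to a lower level).
        return dominates(object_class[obj], subject_clearance[subject])

    print(may_read("analyst", "source_report"))   # True
    print(may_read("clerk", "source_report"))     # False: read up refused
    print(may_write("analyst", "bulletin"))       # False: write down refused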

4.6 Guideline Adherence

There are many different guidelines, standards, reference models, and general reference texts, which can all aid in the design of secure systems. Working within such frameworks can offer many advantages. These guidelines will often provide such information as detailed requirements for security, assurance, or documentation, enumeration and discussion of specific security mechanisms, and architectural models. However, they generally offer little help on how to achieve these requirements in the type of structured manner we desire. Their limitations are similar to those discussed with policy models, even though these tend to be more general, and encompass more of the system. The different types of guidelines are discussed in the following sections.

4.6.1 Comprehensive References

There are many comprehensive books on general system security, which cover a wide range of issues, such as disaster recovery and contingency planning, safeguard selection principles, operational security, and personnel screening. If any design methodology is discussed at all, it is usually a risk management approach (discussed later in Section 4.7). However, these are generally targeted at the system management or administration level. Although these are useful for the intended target audiences, or to gain perspective, they offer little if any aid in software development. Some examples include [53, 50, 160, 52, 168, 186, 81, 129, 80]. Similar books have attempted to address more of the technical aspects of computer security, though often from a more limited direction such as network or database security. Examples include [110, 57, 51, 79]. However, it was not until the introduction of Gasser's book [87] in 1988 and others later, such as [164], that comprehensive texts appeared which dealt with the detailed technical aspects of software security. Although these are valuable references, they generally neglect the design methodology issue. Numerous isolated security issues and solutions are discussed, but with little mention of how to actually proceed with the design (except perhaps for a risk management approach, as in [52]). This is common even in the earlier technical books. In general, the more technical the discussion, the less there is devoted to the design approach.

Even if a design methodology is discussed, it may not be clear how it is to be applied in practice, or it may appear contradictory. For example, in [52], a risk management approach is discussed early in the book, but later, software security development is described as proceeding in terms of formal methods (formulate a policy model, design features to implement and enforce the model, and then verify the implementation), which bears no relation to risk management. However, in fairness, at least a design approach of some form was discussed. Even the framework in which the technical information is presented often appears haphazard (unless they are very narrowly focussed on a specific security aspect). Most of these books tend to read as a disjoint set of issues, with little to tie them together, especially in terms of any type of design methodology. This was pointed out well in a review [116] of one of the better recent technical books on the subject [164]. The reviewer mentions that the "organization seems haphazard," that "related ideas reappear many times in different guises," and that the author "doesn't develop a cohesive framework or discuss theory systematically;" that he instead "inundates readers with ideas worth considering and events worth remembering" [116, p. 150]. This is not the fault of the authors, but is due largely to the lack of attention that has been paid to the fundamental issue of how one actually designs a secure computer system. As we said earlier, the focus has been more on the tools used to build one, or to verify one, rather than the method by which it is built. A similar problem would exist if we attempted to describe general software development only in terms of the software tools to write programs, or the methods of formal verification, rather than the design methodologies.

4.6.2 Guidelines and Standards

There are a growing number of government guidelines and standards which apply to secure systems. The most visible of these in the computer science and technical security fields are those published by the DoD National Computer Security Center (NCSC) [154, 155, 67, 72, 66, 68, 70, 74, 71, 73], and the applicable Federal Information Processing Standards (FIPS) [153, 152, 147, 150, 148], published by the National Institute of Standards and Technology (NIST). However, many others exist, such as the various ANSI financial security standards [16], the ISO extensions to the OSI Basic Reference Model for communications [8], and the CICA Computer Control Guidelines [42]. These fall into two broad categories: those that deal at the system level, and those that are focussed on subsystem components. We discuss only a small representative set of documents, but the discussion applies in general.

4.6.2.1 System Level Guidelines/Standards

Examples of system level guidelines include some of the DoD NCSC rainbow books (each guideline has a different color, and is often referred to by its color only). These include the Trusted Computer System Evaluation Criteria (TCSEC, or orange book) [67], the Trusted Network Interpretation of the TCSEC (TNI, or red book) [72], the yellow books (technical rationale and guidance for applying the TCSEC in specific environments) [65, 69], and the Trusted Database Interpretation of the TCSEC [75]. Other examples include the DoD Industrial Security Manual for Safeguarding Classified Information [64], the CICA Computer Control Guidelines [42], the POSIX 1003.6 security standard efforts, and DEC's Digital Distributed System Security Architecture [88]. Although much of the risk management literature would also fall into this category, it is discussed separately, due to its importance.

These can generally be viewed as serving any of three purposes [67]. They can 1) help the user clearly and concisely state their security requirements, 2) help the system developer design systems that match these requirements, and 3) help in evaluating what a given system actually provides. They can also play an important role in government system procurement. Such guidelines must necessarily address a wide range of issues, and individual documents are often extremely detailed (e.g., [64, 72, 42]). Some representative examples of the scope of the issues include: security policy requirements (e.g., the military policy [67]); secure operating procedures (e.g., the handling of media storage or security clearances [64]); required security services or functionality (e.g., audit trails, user authentication, or labeled objects [67]); certification methods (e.g., requirements for formal proofs, or penetration testing [67]); assurance mechanisms (e.g., required design steps, or documentation [67]); and architectural models (e.g., communication protocol layers [8], subsystem relationships [72], or a comprehensive system reference architecture [88]). There is no doubt much to be learned about security in general from these publications. They represent the result of years of effort, and the expertise of people who have been dealing with system security for many years. They can provide a wealth of information when viewed in this perspective.

However, these guidelines do not help in designing the mechanisms necessary to enforce the requirements they specify, nor do they help in insuring that those mechanisms are not negated. For example, the TCSEC requirements for a class B2 system do specify design steps, such as providing a formal security policy model, and a descriptive top-level specification. For a class B3 system one must then demonstrate the correspondence between the two [67, 24]. However, our needs are much earlier in the design process. We are looking for a way to develop the top-level design specification in the first place. Similarly, although a guideline may specify that "all operations be audited," this provides no help in seeing how to design this into the system, such that it cannot be circumvented. This was exactly the problem with the simple example in Section 3.1.1.1, where the requirement was to "mediate all file accesses."

For our needs, an even more fundamental drawback exists here. Assume that somehow we could design a security architecture tailored precisely to some particular guideline or standard. How do we know that this will meet our particular needs? How much of what is specified may be completely unnecessary? As Denning very recently pointed out [59], trust is not a property of a system, but rather an assessment made about a system based on the observer's needs. If one designs a system to be "trusted," this can be correct only for an observer who has the same definition of trust. For example, the DoD Industrial Security Manual for Safeguarding Classified Information [64] is over 370 pages of detailed procedures, including those for the handling and physical storage of information and security clearances, and for dealing with subcontractors, visitors, and consultants, who may or may not have the same need to know as the target organization. Although this can probably be utilized, how do we determine which elements of this are actually needed for a given application, or if some of our particular needs are not addressed by any of them? Essentially we need the design tools we were looking for.

Furthermore, one of the fundamental tenets of good security design is to keep the security architecture as simple as possible. The more one puts in, the more difficult it is to analyze or even understand. Thus, simply applying some set of security mechanisms to a problem without knowing why they are needed is not in line with this goal. Not only may there be redundant or useless mechanisms, but there may also be fundamental security requirements that have not been addressed at all. This is similar to the problems with the preventive design approach. As discussed earlier with policy models, it can be difficult to simply state what is meant by "security" for one's own purposes, let alone to try to map this to some set of guidelines which were developed for some other particular notion of security.
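The audit example above can be made concrete. A requirement such as "all operations be audited" is easy to satisfy in form and easy to negate in a design. In the following sketch (hypothetical names throughout), an audited wrapper exists, yet nothing forces callers through it, so one code path meets the requirement while another silently circumvents it. The guideline states what must hold; it does not show where the single, non-bypassable choke point belongs in the architecture.

    # Hypothetical illustration: an audit requirement met in form but not in fact.
    audit_log = []

    def read_record(user, record_id):
        """The raw primitive: fetches a record with no mediation and no audit."""
        return f"contents of record {record_id}"

    def audited_read(user, record_id):
        """The 'compliant' path: log the access, then read."""
        audit_log.append((user, "read", record_id))
        return read_record(user, record_id)

    audited_read("alice", 42)      # leaves an audit record
    read_record("mallory", 42)     # bypasses the wrapper; leaves no trace

    print(audit_log)               # only alice's access is visible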

4.6.2.2 Subsystem Level Guidelines/Standards

Examples of subsystem level guidelines include the rest of the DoD NCSC rainbow books, such as the green book (guidelines on password management) [66], light blue book (considerations for personal computer security) [68], tan book (understanding audit in trusted systems) [70], lavender book (understanding trusted distribution of software, hardware, and firmware) [74], burgundy book (understanding design documentation) [71], and the blue-grey book (interpretations of the TCSEC for computer security subsystems) [73]. Other examples include the ANSI financial security standards, such as those for message authentication in electronic funds transfer [7] or encryption key management [6], the Data Encryption Standard (DES) [148] and its specified modes of operation [151], and the OSI layered communication security services of ISO-7498 Part 2 [8].

Such guidelines and standards can certainly be valuable in any security design effort, in much the same way that research in specific areas of security is valuable. They can provide a wealth of information on the individual security issues. However, they are narrowly focussed, just as specific research areas are, and thus cannot offer much assistance in the design process, especially of the type we are looking for. Even if particular application requirements match the standards very well (e.g., electronic funds transfer), the standards will leave many issues unaddressed. The standards exist to insure compatibility, and to provide a baseline level of assured security, but not to specify the actual design of the system. For example, the DES encryption algorithm is assumed to be breakable only by exhaustive key attack, and although a particular standard may insist that DES be employed, the resulting system may be very insecure if the method of key management is faulty. Even if the key management mechanism were also specified, there are many other system aspects to consider, which can all have a large impact on security (as demonstrated earlier in the examples in Section 3.1.1). For example, in the URSA system we must understand which messages carrying what data at what points in the system require protection. The standard or guideline will specify only a very focussed subset of the entire system.

A nonsecurity example can be seen in the OSI layered communications reference model (OSI Model) [5]. Such standardization is obviously important for intersystem compatibility and economic reasons, and as the lower layers have become better defined, this can actually aid the protocol design process. However, many specific application needs may differ widely from the services which the OSI Model specifies. In these cases, there is little reason to conform to the model, except at the external interfaces. For example, the original communication package for the URSA system [211] as well as the new communication system follow proprietary protocols except at the lowest level interfaces where they must communicate with external standardized modules. This is simply because the specific needs were best met in this manner. In general, unless one's needs match the model services very closely, and unless the model provides substantial detail on the actual design of the required mechanisms, such a guideline will not aid significantly in the actual design process. It is good at indicating what must be done, but poor at indicating how to do it or the other issues that must be considered. Furthermore, as discussed in the previous section, it is not trivial to determine if the security needs for some subsystem are met by a particular guideline in the first place.
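The key management point can be illustrated with a toy example. In the sketch below, a keyed keystream cipher stands in for DES (the weakness shown is independent of the algorithm, and we do not mean to suggest any particular DES implementation); the keys are derived from four-digit operator PINs, an invented but representative convention. An attacker who knows only that convention recovers the traffic by trying all 10,000 candidates, no matter how strong the underlying cipher is. The standard fixes the algorithm; the surrounding design is what failed.

    # Toy illustration: a strong cipher cannot compensate for a weak key source.
    # The "cipher" is a stand-in keystream XOR, not DES; the point is the key space.
    import hashlib

    def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
        stream = hashlib.sha256(key).digest()
        stream = (stream * (len(plaintext) // len(stream) + 1))[:len(plaintext)]
        return bytes(p ^ s for p, s in zip(plaintext, stream))

    toy_decrypt = toy_encrypt      # XOR with the same keystream inverts it

    key = str(1234).encode()       # faulty key management: key derived from a PIN
    ciphertext = toy_encrypt(key, b"WIRE $250,000 TO ACCT 7781")

    # The attacker knows the derivation convention and a plausible crib ("WIRE").
    for pin in range(10000):
        guess = toy_decrypt(str(pin).encode(), ciphertext)
        if guess.startswith(b"WIRE"):
            print("recovered with PIN", pin, ":", guess.decode())
            break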

4.7 Risk Management

The oldest method of designing secure systems of any type is what can be referred to as the risk management approach. This generally consists of the following steps: determining one's assets, assessing the possible threats against these, determining the vulnerabilities by which those threats can be carried out, assessing the risks associated with those threats and vulnerabilities, and finally, designing safeguards to address the important vulnerabilities, based on the relative risks and costs involved. There are many variants, such as [32, 187, 136, 38, 156, 205, 160, 41, 147, 150], discussed in much more detail in Chapters 5 and 6. Our use of the term "risk management" is a broad one. It covers any design method that follows this general approach, regardless of whether all steps are included, or of how quantitative or qualitative it is. The general requirement is that the design be driven by the determination of security vulnerabilities, in a systematic, cause-effect manner. This approach in some form has been used for thousands of years. Risk management is simply a fundamental part of life; from one's own life and health insurance policies, to home smoke detectors, to simple floppy backup of personal files, to off-site storage of organizational backup media, to full company contingency plans to handle catastrophes.

Although each of the security design approaches discussed in this chapter serves an important role for different needs, risk management techniques offered the only structured, predictable approach to secure system design which attacks the design in a controlled cause-effect manner. Certainly, find-and-fix is also cause-effect driven, but it lacks structure and predictability. Risk management appeared to allow security design to be approached in the same methodical way as other system design issues, only instead of starting with required functionality and producing modules that provide the functions, one starts with security flaws and produces modules to address those flaws; a very logical approach. Certainly, risk management can also be combined with other security design techniques such as formal approaches or guideline adherence, but for our needs it appeared to be the logical starting point.

However, risk management is not a panacea. These techniques generally address the design issue from a much higher level of abstraction than we require. The major sources of information are targeted at system management or administration with the goal of developing a comprehensive organizational security program. As such, this encompasses much more than computer or software security, including natural or man-made disasters, the physical security of personnel, offices, and buildings, as well as developing contingency and disaster recovery plans. Consequently, very little of the risk management literature dealt with the issue of software design. As we will show in Chapter 6, applying risk management techniques to detailed software design proved extremely difficult, especially in a dynamic, distributed environment, where assets and threats are rapidly changing form and location. Even for the higher level approaches to system design for which risk management was developed, these techniques can be very complex and much of the literature is geared towards simplifying the problems encountered. There have been a number of attempts to extend or modify these techniques for software design purposes, such as [194, 165, 81, 80, 41]. Some of these offered no help, but those that were relevant are discussed later along with the more conventional approaches in Chapter 6 as we describe our attempts to employ risk management techniques in the design of the URSA security architecture, and then later as they relate to our proposed design approach.

4.8 Safe and Reliable Systems Design

Two closely related areas of research that should be mentioned are the design of safe and reliable (fault tolerant) systems. However, discussions of the relationship between these and the design of secure systems are almost nonexistent in the literature. Although this is a complex area, the fundamental difference between safe and reliable systems can be expressed as follows [35]. Reliability requirements try to insure that the system correctly performs its intended function under some set of conditions (i.e., being failure-free or fault-tolerant), whereas safety requirements try to insure that conditions leading to an accident (hazards) do not occur (i.e., being accident-free), regardless of whether the system was performing its intended function or not. Accidents thus involve notions of risk, loss, or harm, whereas failures simply mean the system did not perform according to specification. A software fault may or may not cause an accident. Likewise, the occurrence of an accident may or may not have been caused by a software failure. In general, these both must deal with risks, threats, negative requirements, and the difficulty of assessing the level of safety/reliability/security achieved [130].

However, the issue of safe software design seems more closely aligned with our needs since it encompasses the notion of undesirable events or accidents. For example, software that controls a building heating system might be subject to the safety policy that the building not catch on fire. If someone turns the thermostat to maximum, enforcement of the no-fire safety policy might require the system to stop at 100 degrees, even though reliability requirements would have it faithfully follow the thermostat (its design specification). In general, the design of safe systems appears more difficult than secure systems. It appears much easier to define "wrong" behavior in a secure system. In the security domain we usually simplify this task by looking only at things such as unauthorized release or modification of information. These are what are mapped into the policy requirements (e.g., if information moves from object A to object B, then the classification of B must be at least as great as that of A). This is but a tiny subset of all possible hazards that can occur in some systems. For example, while testing a B-1A bomber, the close weapons bay door button was punched while the door was blocked open for repair work. Two hours later when the block was removed, the door unexpectedly closed since no time limit on command completion had been programmed [130]. These kinds of issues introduce complexity far above that in the security design domain, and are not easily mapped into a mathematical policy model for safe systems. However, just like security compromises, accidents are usually caused by multiple factors where the relative contribution of each is not clear [130]. As such, fault tree analysis is generally employed in the design process along with risk management techniques. In fact, we employ fault tree analysis techniques in our design methodology, which is also based on a risk management foundation, presented later in Chapter 7. However, fault trees suffer from the same complexity explosion as risk management techniques when applied to our target class of system, which is why the rest of our design methodology is required. This complexity problem is examined at length as the methodology is described. In the end it appears that safe system design tools are even more lacking than those for security design, if only because of the more complex task.

Fault-tolerant systems may not appear similar to secure systems, but there is a correspondence. If one could completely specify the "correct" secure behavior of a system, then any deviation from that specification would be a fault, and would signify a security violation. This is promising since the design of fault-tolerant systems appears more tractable than safe systems. Rather than trying to define all forms of hazardous behavior, one is only concerned with determining if the system correctly implements its design specification. Although this is still nontrivial, it is certainly more well defined. In hardware fault tolerance, one is usually asking if a component worked correctly and invariably handling it through redundancy [170]. This can be done by comparing to some known component(s), or its known behavior. A similar approach has been proposed for computer virus control [119], though it is extremely difficult to specify known "correct" behavior of an entire system. A more fundamental problem is that in a fault tolerant system one does not expect the check mechanisms to be subverted. Even if they can fail, redundancy can prevent malfunction. In security, however, if one mechanism can be broken then its identical partner can also be broken. The critical difference is malicious attack versus random failure. As for software fault tolerance, according to [169] no major body of theory yet exists. Of course, fault tolerant software systems are still successfully designed, employing techniques such as fault avoidance (since many software faults are due to design errors) and redundancy techniques analogous to replicated hardware. The latter are difficult to apply due to the intrinsic complexity of software, and again suffer from the malicious versus random failure difference. Thus, although fault-tolerant systems can be easier to design than safe systems, even these do not employ tools of the type we need. It appeared that both safe and reliable system design techniques were lacking the same basic design tools, though this may be an excellent area for future research. There may be many techniques that are applicable; as stated earlier, fault tree analysis techniques are employed in our proposed methodology. However, the risk management direction was a logical first choice, since it is the oldest and the most widely discussed in the general security literature.
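Since fault tree analysis recurs later as part of our own methodology, a minimal sketch may help fix the idea (the events, structure, and scenario below are invented). A fault tree works backward from an undesired top event through AND/OR combinations of contributing events; evaluating the tree for a given set of basic events, or extracting its minimal cut sets, shows which combinations of low-level causes are sufficient to produce the top event.

    # Minimal fault-tree sketch: does the top event occur for a given assignment
    # of basic events? Events and structure are invented for illustration.
    def AND(*children):
        return ("AND", children)

    def OR(*children):
        return ("OR", children)

    def occurs(node, basic):
        if isinstance(node, str):                 # a basic (leaf) event
            return basic.get(node, False)
        kind, children = node
        results = [occurs(child, basic) for child in children]
        return all(results) if kind == "AND" else any(results)

    # Top event: a sensitive file is disclosed.
    tree = OR(
        AND("backup tape unencrypted", "tape leaves the site"),
        AND("file permissions wrong", "untrusted user on the node"),
    )

    scenario = {"backup tape unencrypted": True, "tape leaves the site": True}
    print(occurs(tree, scenario))                 # True: this pair alone suffices

The complexity explosion mentioned above appears as soon as the basic events must themselves be enumerated for every asset, in every form and location, of a real distributed system.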

4.9 Conventional Software Development

Conventional software development methodologies, specification techniques, and modeling efforts are certainly applicable to the design of secure systems. They must be if one is designing a software system. For example, our methodology employs fundamental ideas such as object-oriented design, functional decomposition, and elements of both top-down and bottom-up approaches. However, for the security design, we need tools to augment these approaches [85]. The reason is simply the negative requirements specification issue. As was pointed out in Chapter 3, this made it very difficult to move cleanly from requirements to the design, and to understand the level of security achieved in the manner that conventional development approaches provide. Some of the literature has begun to address this issue. One of the earliest to point out the need to integrate security design requirements with conventional software development methods was [18], followed later by [140, 85]. Other recent examples include integrating the TCSEC security requirements [67] with the software development process of DOD-STD-2167A (a waterfall model) [25, 24, 54], integrating the TCSEC with the spiral development model of Boehm [34] [139], and including security in rapid prototyping environments [10]. However, these efforts are concerned mostly with fitting the security specification and testing requirements into the conventional design methodology. They do not deal with the actual security design problems that we are concerned with in this research.


4.10 Experience

One important design method is to simply utilize experience. Having designed one secure system will certainly help in designing the next, and approaches that worked in previous systems can be utilized as appropriate. This is how many secure systems are designed, often with very good results, and this can play an important role in any of the previous methods. However, for the small system developer, such previous experience is probably not available. In any case, this is orthogonal to the issues of developing the design in a structured manner, assessing the level of security achieved, and providing documentation for future designers.

4.11 Recent Work Supporting Our Position

Only recently have others begun to point out the lack of adequate design tools for secure systems. Some of this literature is briefly mentioned in this section. The work relevant to our needs will also be discussed later in context, as it relates to other existing design methods or to our proposed design approach. In [120], the author points out that most mathematical analysis of secure systems is performed after-the-fact, applied to an already existing design. One of the first to point out the need for mathematical tools to aid the designer, not just to certify an existing design, was [94]. He maintains that the primary value of mathematical analysis should be to guide the process of producing the system, not just for certification after the system is built, and that the current practice of engineering digital systems is far removed from this goal. Although these ideas reinforce our beliefs, the approach is not applicable to our needs. He proposes a mathematical function that accurately and completely describes the physical behavior of a digital device, but only discusses this at a high-level view. Clearly, this is a formidable task for a complex software system, and even if possible, presupposes a design for the device to start with. The desire for a security design methodology that provides tools for simulation analysis and feedback during the design of policy models was expressed in [103]. They provide for simulation during the development process, allowing evaluation of model features and system behavior based on a given model. However, as discussed earlier in Section 4.5.2, model development is a necessary but far from sufficient aspect of our needs. We are looking for tools to aid in the system design process. In [166], the need for a methodology for designing secure computer networks is discussed. This is directly in line with our belief that such tools are needed at the application development level. Although this is a positive step in the right direction, it focusses on the network only (not the entire distributed system), and does not provide the level of detail and control that we require. It assumes a much higher level view. Further, as discussed in [95], it does not really deal with the design process itself. The continuing need for a coherent design methodology for network security was pointed out in [95], in which another methodology for network security design is presented. This is another positive effort but again is focussed only on the network aspects. Further, although the problem is addressed in more detail than the previous approach [166], the required degree of detail and control is still missing.

One of our fundamental goals, the need to convert an existing application to a secure version, was pointed out in [86]. They discuss the "rescue" of the investment in an installed software base as a neglected area in secure systems. This, however, is a very high-level paper, narrowly focussed on the use of a trusted computing base to meet particular TCSEC requirements, and thus does not apply to our needs. Lastly, in [90], the author points out that even though functional testing is the most common technique for assurance of a secure system, little research has been done in the discovery of testing methods specifically tailored for security. From our research, it is apparent that a similar situation exists for design methods, especially for application developers with limited resources.

4.12 Summary

Although a wealth of information exists on many different aspects of computer security, the fundamental issue of how a secure system is designed and maintained in a systematic and predictable manner has been largely ignored. As such, there has been no categorization or classification of current secure system design approaches from this perspective. We undertook a detailed investigation into how secure systems of all types were being designed, and were able to group the various approaches into a set of broad categories which could then be analyzed somewhat independently. These are find-and-fix, preventive design, formal methods, guideline adherence, risk management, safe and reliable systems design, conventional software development, and experience.

The design process for many systems is probably based on a combination of experience, find-and-fix, and preventive design. Although formal approaches and guideline adherence have played a major role in research, neither of these approaches addresses the design methodology itself. In the available literature on most systems, the after-the-fact analysis of the security architecture, such as how they implemented the functions required by a policy model or guideline, or how they formally specified and verified the design, is of much more interest than any discussion of how they arrived at the design in the first place. Of course, much can be learned from such analysis, such as the types of security issues they faced, viable solutions to common problems, and the difficulties they encountered. However, this is all ancillary to our needs. The issues in the design of safe and reliable systems would appear closely related to those of secure system design. However, discussions of this relationship are almost nonexistent, and current approaches do not apply well to our needs, although this may be an excellent area for future research. In fact, we employ fault-tree analysis in the design methodology. Conventional software development methodologies are clearly of importance, and we do employ common elements in the design methodology. However, although some effort has gone into understanding how to integrate security specification and testing requirements into this process, it is from the direction of incorporating the security design, rather than providing tools to aid in the design process. There are no applicable tools to deal with the fundamental negative requirements specification issue.

The only approach that addressed the security design problem in a well-structured manner aligned with our needs was that of risk management. Because it is also the oldest method of designing secure systems, it was a logical starting point. However, very little of the risk management literature deals with software design, and as will be seen in the following chapters, applying this to complex software systems results in many difficulties which must be overcome.

CHAPTER 5

RISK MANAGEMENT TECHNIQUES

As discussed in the previous chapter, the classical risk management approach was the security design method most applicable to our needs, and a logical starting point. In this chapter the foundations of risk management are discussed in more detail, and necessary terminology is defined. In the following chapter, our attempts to employ these techniques are discussed.

5.1 Definition of Terms

The general risk management methodology was briefly described in Section 4.7. However, terminology is not well standardized in this area, and can be very confusing [32], such as the distinctions between threats, risks, perils, and hazards. Even risk management can be defined in numerous ways. However, "security" itself is often an ill-defined term [160], and even those who are securing systems will often not have a clear definition of security. The terminology problem, although not critical in high-level discussions, complicates matters greatly when attempting to apply these techniques, and compare various approaches, as we will see in the next chapter. In order to provide a foundation for the remaining sections, the terminology is now discussed in more depth. Where possible, terms are precisely defined as we will use them, but in other cases (e.g., where we will later explore different approaches which each define a term differently) only a general guideline is provided.

5.1.1 Risk Management

Our definition of the risk management process can be taken almost directly from Fites [81]: a scientific, systematic approach to the analysis of security risks and the introduction of cost-effective safeguards to reduce these risks. The only difference in [81] is that he insists on a "quantitative analysis," while we do not. In fact, we essentially insist on a nonquantitative analysis, the reasons for which are discussed in Sections 5.1.5 and 5.3. While there are many different approaches, such as [32, 187, 136, 38, 156, 205, 160, 41, 147, 150], the risk management technique generally consists of most of the following steps. Each is discussed in more detail in the following sections; a schematic sketch of how the steps chain together follows the list.

- Determine the assets, those entities that require protection.

- Determine the possible threats against these assets.
- Determine the vulnerabilities by which those threats can be carried out.
- Assess the risks associated with each vulnerability, and rank in terms of importance.
- Design safeguards to address the important vulnerabilities, based on the relative risks and costs involved.
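The classes and the single example entry in the sketch below are invented purely for illustration; they are not a prescription for any particular analysis, only a schematic of how the steps connect.

    # Schematic skeleton of the risk management steps; all entries are invented.
    from dataclasses import dataclass

    @dataclass
    class Asset:
        name: str

    @dataclass
    class Threat:
        agent: str                     # the causal agent
        asset: Asset

    @dataclass
    class Vulnerability:
        threat: Threat
        means: str                     # how the threat can be realized
        risk_rank: str = "unranked"    # ranked qualitatively (see Section 5.1.5)

    @dataclass
    class Safeguard:
        addresses: Vulnerability
        mechanism: str

    results = Asset("user query results")
    threat = Threat("untrusted user on a shared node", results)
    vuln = Vulnerability(threat, "results cross the network unencrypted")
    vuln.risk_rank = "high"
    plan = [Safeguard(vuln, "encrypt result messages end to end")]
    for s in plan:
        print(s.mechanism, "->", s.addresses.threat.asset.name)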

5.1.2 Determine Assets

An asset is any element of the system that requires protection; these are the elements the security policy pertains to. This can include terminals, networks, workstations, database files, system programs, application programs, user queries, user query results, user files, and users themselves. Not all elements of a system need be assets, only those that are security relevant.

5.1.3 Determine Possible Threats Against Assets

Threats are not nearly as well defined as assets. In Webster's dictionary, a threat can either be an intent to inflict harm, or an entity that can inflict harm. Here, a threat can be defined in many ways, such as anything that could adversely affect the enterprise or the assets [38], or a situation where something undesirable can occur [81], or an expression of intent or potential to inflict harm [80]. Some claim that threats can be broken down into the causal agent (e.g., natural hazards, intentional acts, unintentional acts) and the specific mechanism by which this agent occurs (e.g., flood, arson, typing errors). For example, a disgruntled employee may pose a threat of unauthorized destruction of data. This threat may be carried out by any number of specific mechanisms, such as deleting files, or blowing up the computer room. However, these are separate issues from the threat. A flood may pose the same threat, though by completely different means. These mechanisms by which the threat can occur are actually vulnerabilities, discussed in the next section. In the example above, regardless of the mechanism used, the threat result is the same: unauthorized destruction of data. These are also called outcomes by others [187]. The result or outcome is simply the way in which the threat can violate the security requirements.

If this sounds confusing, rest assured that the literature can be much more so. Parker [160] points out that threats are really composed of five elements, which all must be taken into account to assure that all significant threats have been identified: sources (e.g., employees, vendors, outsiders), motives (e.g., personal gain, incompetence), acts (e.g., overt, covert, physical), results (e.g., disclosure, modification), and losses (e.g., monetary, privacy). Broder [38] makes the distinction between perils, such as fire or flood, and hazards, which contribute to perils, such as a pile of oily rags or building in a flood plain. Norman breaks this down into proximate causes and root causes or perils, which are really more like Broder's hazards. Some authors do not specifically address threats at all. For example, Fisher [80] discusses threats, but in the proposed design process, moves directly from assets to exposures (i.e., vulnerabilities) on those assets. Finally, some authors use the term threat to refer simply to threat results or outcomes (e.g., [173]).

With the above in mind, the following rules apply to our terminology, though keeping these separations is sometimes difficult in practice:

- The term threat is reserved to mean only the causal agent, as is often done [38, 160, 205].
- Vulnerabilities are the way in which the threat can be realized.
- Threat results or outcomes are the way in which the occurrence of the threat violates the security requirements.



5.1.4 Determine Possible Vulnerabilities

A vulnerability is a way in which a threat (or threat outcome) can occur on an asset; a weakness in the system that allows the threat to occur. In the previous disgruntled employee example, two vulnerabilities could be 1) not disallowing login after firing, and 2) inadequate physical protection of the computer room. In another example, two vulnerabilities that can allow a user to unintentionally modify a data file are 1) inadequate control on file protections, or 2) the user runs a Trojan horse program. These are also called exposures by others [80]. Some may refer to calculating exposures in terms of dollar values [81], but we will reserve exposure to be equivalent to vulnerability. The distinction among threats, vulnerabilities, and threat outcomes can be confusing, especially when trying to compare different methodologies.

5.1.5 Rank Vulnerabilities in Terms of Risk

The goal of this step is to determine the expected severity of the vulnerabilities (how much risk they represent), through some form of risk assessment or risk analysis. The reason is to prioritize their treatment and expend resources in a cost-effective manner, better understanding the cost-benefit tradeoffs involved. From Webster's dictionary, as with threats, a risk can be either an intent or an entity, although now it also contains a value, usually a probability of occurrence. Determination of such risk can be done in a number of ways, from quantitative to qualitative. Fisher [80] actually defines risk assessment to be the nonquantitative process, whereas risk analysis implies the assignment of quantitative cost/loss values, though this is not standard terminology. In some strict definitions, risk means only the expected probability of an undesirable event occurring. Then, multiplying this by the expected loss per event yields the expected loss per unit time (which is the true measure of the importance of the vulnerability). A more holistic view [131] defines risk as a function of three things, 1) the likelihood of a hazard (threat) occurring, 2) the likelihood that this will lead to an accident (vulnerability, or undesirable event), and 3) the worst possible potential loss associated with such an event. We will not require this degree of detail, since we will not be dealing with quantitative risk analysis in this dissertation. This is discussed in detail in Section 5.3. Unless indicated otherwise, risk analysis and risk assessment will mean the same thing; simply the qualitative evaluation of the risks. Further, the specific way in which this evaluation is carried out is not of concern to us.
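Although we will not use quantitative ranking, a small worked illustration of what is being set aside may be useful; the figures below are invented. In the strict definition above, the importance of a vulnerability is its expected loss per unit time: the expected frequency of the undesirable event multiplied by the expected loss per event.

    # Invented figures illustrating the quantitative ranking we choose not to use.
    vulnerabilities = [
        # (description, expected events per year, expected loss per event, dollars)
        ("operator mistypes and deletes a database file", 2.0,      5_000),
        ("backup tape lost in transit",                   0.1,     80_000),
        ("malicious insider alters query results",        0.02, 1_000_000),
    ]

    for name, freq, loss in vulnerabilities:
        annual_loss = freq * loss      # expected loss per year
        print(f"{name}: ${annual_loss:,.0f} per year")

    # 10,000 / 8,000 / 20,000 dollars per year: the ranking depends entirely on
    # the invented estimates, one root of the controversy over such methods.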

5.1.6 Design Safeguards to Address Vulnerabilities

A safeguard is a mechanism which eliminates or reduces the possibility of a vulnerability. This can include operational policies, physical guards, software, and hardware. For example, the file protection vulnerability above may be reduced by running a background program that constantly monitors and corrects file protections. These are also called controls by others [80]. Note that safeguards are not always designed just to address known vulnerabilities. For example, they may be deterrents such as audit trails, or simple detectors of anomalous behavior. The design of safeguards will draw heavily on related work in secure systems of all types, from research to commercial, and from the technological to the operational aspects. Some of these were mentioned in Section 2.2 and Chapter 4. This not only provides a foundation of known security techniques, but has also provided a large body of generally accepted security design principles which one should attempt to follow, as discussed in Appendix F. It is important to realize that safeguards are assets just like any other system resource. The difference is that they appear after the fact, and continue to do so as the design progresses. They must be integrated into the analysis and protected, just as any other asset.
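The point that safeguards re-enter the analysis can be shown schematically (the entries are invented): each safeguard added to the design becomes a new asset, which must itself be examined for threats and vulnerabilities, so the analysis is inherently iterative.

    # Schematic of the feedback loop: every safeguard becomes a new asset to analyze.
    assets = ["user query results"]
    safeguards = {"user query results": "network encryption device"}

    work = list(assets)
    analyzed = []
    while work:
        asset = work.pop()
        analyzed.append(asset)             # threat/vulnerability analysis goes here
        added = safeguards.get(asset)
        if added and added not in analyzed and added not in work:
            work.append(added)             # the safeguard is itself an asset

    print(analyzed)   # ['user query results', 'network encryption device']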

5.2 A Note on Practical Approaches

Although the above steps capture the spirit of risk management, the approaches in practice can differ in many ways, usually due to practical considerations of applying this to real systems. In the next chapter, a case study of our attempts to utilize these techniques, the practical considerations and the need for different approaches will become more apparent. Following are some examples of the issues.

Listing all assets can be tedious, especially as the level of detail increases. Not only is the number large, but any given asset can exist in different forms and locations during its lifetime [160]. This is further complicated in a dynamic, distributed system. Taking this into consideration in a manageable way is not easy, especially when combined with the previously discussed threat breakdown into sources, motives, acts, results, and losses. In dealing with threats, two major issues are the required level of detail and quantification. For example, some just refer to the results of threats [173], while others refer to the entire spectrum of threat classification [160], while others use a hierarchical plan of attack [156]. Level of quantification depends on the method of risk assessment, discussed later in Section 5.3. The number of possible vulnerabilities in a system can be extremely large, virtually infinite, since this constitutes all possible ways in which security can be violated. Thus, all methods employ mechanisms to keep this tractable, such as categorization [149], exhaustive checklists [146], or by making vulnerabilities look more like threat results rather than a means of achieving those results.


5.3 The Issue of Risk Analysis

Risk analysis can mean anything from highly structured methods involving probability and dollar value of assets, to plans based on a few observations and interviews [160]. One of the better known quantitative methods was developed by IBM in the '70s [160], and forms the foundation of NIST FIPS Publication 65 [150]. This is based on order of magnitude calculations (rather than extreme accuracy) [206] of expected frequency of occurrence of events, and expected loss per event. The product is the expected loss per year. However, the desirability and validity of quantitative risk analysis is a very controversial point, with many different problems and approaches suggested [187, 81, 186, 136, 160, 206, 205, 156]. These issues are beyond the scope of this research, but it is worth noting some key points. The assignment of arbitrary values (e.g., when solid data are not available) at various points in quantitative analysis can produce questionable results; the simple resulting figures do not reflect the trust one has in all the elements which went into them. The LAVA system at Los Alamos [187] provides a very rigorous approach to dealing with this problem, and successful use of probabilistic risk assessment techniques in the nuclear power industry is discussed in [39]. However, it does seem that the more productive efforts have been based on general, macro analysis, rather than on detailed, highly quantitative micro analysis [136]. Risk analysis can be very time consuming; it is important to not spend more assessing the risks than would be lost if they were simply accepted [80]. As an example of the difficulty of assessing risk, one traditional rule-of-thumb that is often stated is to simply make the cost of a successful attack on an asset exceed the value of that asset to the perpetrator. However, this may be neither practical nor adequate [160]. First, determining its value to the perpetrator is generally not possible, and second, this assumes the attacker is rational, knowledgeable, and uses good judgement, which is often not the case [160]. A perfect example is the malicious hacker, who may be willing to spend whatever amount of time is required, simply for the goal of succeeding. As stated in [197, p. 80], "risk assessment is, at best, an art, requiring lots of experience combined with trial and error," with which we certainly agree.

We will not include any form of quantitative risk analysis in this research. There are three reasons why this decision is in line with our fundamental goals. First, such analysis is purely a method of ranking vulnerabilities. The specific way in which this is performed, we will see, is orthogonal to the rest of the design approach, and thus to the main goals of this research. Changing the method used to rank vulnerabilities would not have a direct impact on the general methodology. Thus, there was no desire to complicate this issue initially, especially since the quantitative approaches are still controversial. Second, even if this were not the case, the relative value of assets and danger of threats is extremely dependent on the specific end-user needs and environment, not just the application system (examples of this were given in Section 3.2). Such an analysis can be meaningful only in that context [205, 38, 165]. This has no direct relevance to our search for a general design method. Third, the vulnerabilities which must be safeguarded will have a large impact on the resulting security architecture. Our goal, however, was for an approach to investigate the range of possibilities, not to establish any one particular design. We were looking for a design approach, not a design for a specific situation. Any attempt to rank risks would seem only to obscure our primary goal.

CHAPTER 6

ATTEMPTS TO EMPLOY RISK MANAGEMENT

In this chapter we present a detailed analysis of our attempts to employ risk management and other related techniques in the design of a security architecture for the URSA system. For background on the URSA system and information retrieval, an introduction was given in Section 2.1.2, and Appendix A provides more detail on the URSA implementation and functionality. The reader wishing to quickly get to the design methodology should read this introduction and the summary of this chapter in Section 6.12.

Although we had not read of risk management techniques directly applied to complex software design, the literature generally indicated this was possible. As discussed earlier, there were no suitable alternatives and this was a logical direction in which to proceed. We believed that even if this did not work out, it was necessary to fully understand why in order to better understand where to go next. Our purpose in this chapter is twofold. The first goal is to demonstrate the fundamental limitations of these approaches when applied to distributed system security design, exemplify the problems encountered, and gain insight into the types of tools that might alleviate those problems. From this one can better understand the reasons behind the proposed design approach discussed later in Chapters 7, 8, and 9. The second goal is to record this information for others investigating the security design problem, allowing them to build on our experiences. By providing them with sufficient detail, they may be able to see alternative approaches, or discover mistakes in our efforts, while not having to reinvent the same wheels we have already stumbled over. To achieve this goal, it was necessary to provide extensive detail in this chapter. It is necessary to present a case study since it is only when one applies these techniques to a real system that their limitations become visible. However, as our actual efforts to do this spanned multiple years, many details had to be omitted. Some claims made early in the following sections may appear to lack substantiation. Reading further should provide the necessary details along with examples to support those claims. In the end, if one is still not convinced of the difficulty or complexity encountered, continuing on through the design methodology chapters should provide the necessary framework, after which the claims should be much better substantiated.

Although this analysis is performed entirely within the context of the URSA system, it should be apparent that the results are applicable to any similar class of distributed application. It is not the details of the URSA system, but rather its fundamental architecture and the fundamental design goals which cause the difficulties. As stated earlier, for the reader wishing to get to the design methodology details, this chapter can be skipped by reading only the summary in Section 6.12 without loss of continuity. However, the design methodology discussion will sometimes allude to difficulties exemplified here. These chapters actually reinforce each other. Although this chapter demonstrates the problems encountered during the design process, it can do so only in a limited context. The design methodology chapters will explore the design process in much more detail, after which the reader should be acutely aware of the issues involved.

6.1 Starting to List Assets

The first step is to list the assets. Although these only include security relevant entities, the only safe approach is to start with the set of all possible assets, then carve that down. The main reason is that one can later see that a given entity was excluded by desire, and not simply forgotten. Further, if one has to later redetermine assets for the same system (e.g., for new security needs) the full set of assets is already there as a starting point. In addition, this can always be modified over time, as the system evolves, even if new items are not security relevant at the time they are added. This same principle is commonly followed in threat and vulnerability assessment as well, listing all possible ones before dismissing any as not relevant.

6.1.1 Initial Approach

We began by trying to make a comprehensive list of all possible assets for a generic URSA system. The concept appeared straightforward, with high-level categorizations found in much of the risk management literature. Such categories include people, facilities, supplies, hardware, systems programs, application programs, data information, etc. (e.g., [160]). The goal seemed to be a list that was sufficient, not redundant, and not too long [81]. The initial list was categorized hierarchically, with the top-level divisions similar to those of [156], which included:

- Site personnel
- Development personnel
- Facilities support
- Hardware
- System programs
- Application programs
- Data / Information
- Services

These were broken down into subcategories, such as the following:

- Site personnel: users, system administrators, management, software service, hardware service, facilities service, etc.
- Hardware: disk/tape/floppy drives, printers, nodes, network (cable, I/F hardware), etc.
- Systems programs: filesystem, process/memory management, login system, communications, command shell, display system, etc.

Although there are numerous published detailed checklists for this purpose (e.g., [38, 205, 26, 156, 160, 41]), we did not use them initially. The goal was to see how far we could get by first brainstorming, without biasing our efforts, and then later to look at the published lists for guidance.

6.1.2 Difficulties with Listing Assets

Although it was relatively simple to list assets at the above level of abstraction, it became apparent that we needed a much finer grain of detail in order to design a secure software system from this. For example, "information" is listed as an asset, but at that level of detail what could we do with it? In order to understand the threats and vulnerabilities against it and how to counter them, we would need to know at least what it was and where it resided. For example, consider an application program. It can reside on CD before installation, on disk after installation, in memory when running, or on tape due to file saves. Those tapes can be on or off site, or in transit between. When running in memory, it can simultaneously be in pages on the local disk, or in the case of a remote paging server, as pages on the network and on the remote disk and in the remote node memory. The program is also partially within any run-time data files it may use. It can exist as a single stand-alone executable, or spread among dynamically bound shared library files, which can be both loaded and in backup storage. There is also the distinction between the executable binary version, the source code version, and any intermediate binaries. The same is true for operating system programs. These issues are critical in assessing the threats and vulnerabilities of the application program.

In addition to such location information, it is equally important to know the significance of the information. For example, the following three pieces of information each have very different security requirements.

- A cryptographic key may have to be hidden from all users.
- The response to a user query will have to be visible to the user, but hidden from everyone else.
- The response to a date-time command may not be considered security relevant at all, and may be visible to the world.

All of these issues must be considered when trying to assess the threats and vulnerabilities of each asset. Of course, we may have to worry about only some of them, but without first describing all possible assets how can we know that the analysis is complete? If we just pick what appears to be security relevant at the time, we are doing little better than the find and fix design approach.

One straightforward approach that we tried was to simply list each of the ways in which the different assets can exist as attributes under each. However, this list became very difficult to structure and maintain. For example, under application program we could list all of the issues discussed above. However, the same would apply to system program, and much of it would also apply to the data/information category. Data files would certainly be subject to much of the same analysis, though with their own additional issues. This quickly became very redundant, and it was difficult to keep the common attributes in many of the categories in sync. For example, we might undertake a detailed refinement of the issues involved with a particular asset. The problem then would be to remember all the other assets to which this applies, and then to modify their attributes accordingly. Invariably some will be missed. Further, we eventually have to break down these high-level assets into more detailed elements. For example, application programs must be refined into the different types, such as index, query, or search machine, since each of these may have very different threats, vulnerabilities, and resulting security requirements. Each may also reside in different environments and thus require different underlying attributes.

6.1.3 A Relational Assets Structure

It should be apparent that an asset can be many things at many different times and places. This was previously pointed out by Parker, stating that assets have both a form and a location (though only done at a very high level of abstraction) [160]. He specified each asset as an {asset, form, location} 3-tuple, which we felt was a direction worth pursuing. For simple assets, especially static entities such as hardware devices or file cabinets, breaking them down individually into this categorization is reasonable. However, to avoid difficulties similar to those exemplified in the previous section, a more structured approach was required for our purposes. It appeared that a relational approach would help to avoid the increasing replication and redundancy in the growing lists. In addition, we attempted to employ a loose form of inheritance where entities could be viewed as a-form-of other entities, each defined in terms of more fundamental ones. An example of this is shown in Figure 6.1 (a trailing "*" means the item is further described elsewhere; no trailing "*" indicates an atom). A more complete (though not comprehensive) version of this for the URSA system is given in Appendix C. This was not an exact representation. For example, the a-form-of operator implied some inheritance of attributes of the target entity, but we did not specify exactly what this meant. This was an initial attempt to try and capture more meaning in a more structured manner.

------------------------------------
program
  a-form-of information*
  forms: sources
         intermediate code
         executables before/during/after installation
------------------------------------
information
  forms: on data_storage_media*
         in processor_memory_contents*
         on network in transit
------------------------------------
data_storage_media
  forms: disks, tapes, floppies
         paper
         human memory
  locations: on data_storage_device*
             in physical storage on site
             off site
             in transit to/from device
------------------------------------
data_storage_device
  forms: disk, tape, floppy drives
         paper filing mechanisms
         humans
------------------------------------
processor_memory_contents
  forms: in processor RAM
         pages on disk, pages on network
  locations: wherever these exist...
------------------------------------
and so on...

Figure 6.1. Example of Assets Explosion and Hierarchy

When carried out for the whole system this quickly grows, with much interconnectivity. For example, the index is a-form-of program, but perhaps with its own specific peculiarities (e.g., some of the information it contains may have to be described in more specific terms than data contained in other programs). Hardware is an asset, and will include CPU boards, memory boards, interface boards, interface cables, and peripherals. Peripherals would include disk drives, tape drives, printers, keyboard, user display, and the network. A disk drive would consist of the drive hardware and possibly removable media.

It was not always easy to draw the distinction between form and location. For example, with data storage media, it was straightforward, with forms being disks, tapes, floppies, etc., and locations being in-device, on-site-storage, off-site-storage, etc. However, with processor memory contents, it seems to change form and location simultaneously, such as when in processor RAM, in page packets on the net, or in page files on disk. We did not try to be exact, but instead viewed this as an experiment to see where it would lead.
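The following sketch (ours, in Python; the class and attribute names are illustrative and not part of any published method) shows one way the {asset, form, location} structure and the loose a-form-of inheritance of Figure 6.1 could be represented, with forms and locations accumulated from the more fundamental entity.

    from dataclasses import dataclass, field

    @dataclass
    class Asset:
        name: str
        forms: list = field(default_factory=list)
        locations: list = field(default_factory=list)
        a_form_of: "Asset" = None   # loose inheritance link (None for fundamental entities)

        def all_forms(self):
            inherited = self.a_form_of.all_forms() if self.a_form_of else []
            return self.forms + inherited

        def all_locations(self):
            inherited = self.a_form_of.all_locations() if self.a_form_of else []
            return self.locations + inherited

    # Mirroring Figure 6.1 (abbreviated):
    information = Asset("information",
                        forms=["on data_storage_media", "in processor_memory_contents",
                               "on network in transit"])
    program = Asset("program",
                    forms=["sources", "intermediate code", "executables"],
                    a_form_of=information)

    print(program.all_forms())
    # ['sources', 'intermediate code', 'executables', 'on data_storage_media',
    #  'in processor_memory_contents', 'on network in transit']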

6.1.4 Employing Checklists to Add Structure

We had hoped that the published checklists would be valuable at this point for two reasons. They would help fill in any gaps, and might help provide a structure to this unwieldy mass of information. The checklists were scattered throughout the literature (e.g., [160, 38, 205, 26, 41, 156]), in many different forms, applied to different systems and requirements, and with much overlap among them. In order to be useful as a design tool we had to generate our own lists from the integration of many of these along with our own original lists. This was a valuable experience. It enabled us to see broad categorizations, combine things which initially did not seem redundant, and clarify many issues by seeing many different perspectives.

However, most of these checklists pertained to a much higher level of abstraction than we required. This was generally because they dealt with a wide spectrum of security issues, such as fires and floods, building emergency management, kidnap and ransom contingency planning, physical grounds control, property control, and personnel screening (e.g., [168, 186, 38, 146]). Of course, this was to be expected, since most of the risk management literature was focussed on the system management level. The few that did focus more on technical issues seemed to skirt the problem of the "details" [81, 41, 166]. For example, Campbell suggests a modular, hierarchical approach to the problem [41], but only with examples that worked at a very high-level view, and that offered little help with the issues we have been discussing.

6.1.5 Fundamental Problem with Assets Lists

The assets listing was becoming difficult to understand. It was clear that assets-list approaches may be fine for static entities, for example, if what one means by "securing" a terminal or computer system is protecting it from theft or destruction. However, they did not apply well to our needs. How the individual assets related back to the application system was simply not visible anymore. Looking at the structured assets list in Figure 6.1, or the more detailed one in Appendix C, we see very little information about the original application or its dynamic operation. From within the mass of details, we could no longer see the big picture. Given this, there seemed to be no way to determine what was security relevant, and how it was, once the time came to apply threats and vulnerabilities to these assets in order to design safeguards. For example, it would be nearly impossible to apply our assets listings to the task of suggesting which files on which disks are vulnerable, which network links or which data elements on those links need protection, or where user authentication is required. The reason is simply that we had lost all correspondence between the assets and the application. Some better structure or organization was going to be required.

6.2 Threats and Vulnerability Lists

In parallel with the above efforts, we were also trying to develop threat and vulnerability lists. These proved even more difficult than the assets.

6.2.1 Ill-defined, Less Tangible than Assets

The lack of precise terminology, previously discussed in Chapter 5, made it difficult to clearly understand our goals in this effort. Attempting to list all possible vulnerabilities in the system was of course impossible, so it was very important to understand just what we were trying to do. Specifically, the real distinction among threats, vulnerabilities, and possible results was very unclear. As with assets, it was easy to discuss these at a high level of abstraction, as was done in defining the terms in Chapter 5. However, applying them in practice was not easy.

For example, consider a scenario discussed earlier. We can view a disgruntled employee as posing a threat of unauthorized data destruction. This may be achieved by exploiting the vulnerabilities of 1) not disallowing login after firing, and 2) inadequate physical protection of the computer room. This allows the employee to achieve the results of either deleting files or blowing up the computer room. Even in this simple example the threat looks almost identical to the results of the vulnerability, namely the unauthorized destruction of data. However, other situations are different. For example, a threat may be a Trojan horse attack, which may be realized by exploiting the vulnerability of inadequate file protections, which may result in the unauthorized release, modification, or destruction of information. Here, the threat is quite different from the results of the attack.

The above issues exemplify a fundamental problem. Threats and vulnerabilities proved to be far less tangible than assets and correspondingly much more difficult to define. We could easily envision assets such as a node, disk drive, network, or even a program in both time and space. However, visualizing something like a Trojan horse attack was very different. Such an attack spans a range of time, from when the program was introduced until it is run, as well as a spatial range, since the program or data housing it may move between the time it was introduced and the time it is run. Certainly, we could simply write this as one entry in the threat list called Trojan horse attack (as some of the literature would suggest), but we had no idea how to record and later analyze all of the nuances associated with it. For example, how would we know which nodes were vulnerable to its being installed?

6.2.2 Threats Lists

Given the more nebulous nature of threats and vulnerabilities, we relied much more heavily on published information such as checklists than we had for assets [136, 144, 26, 160, 134, 41, 156, 12]. As with assets, this information was scattered throughout the literature in different forms and for different purposes. However, unlike with assets, the different sources of information were often created for different definitions of the terms threat and vulnerability. This further complicated the matter.

Parker [160] points out that threats are really composed of five elements, which all must be taken into account to assure that all significant threats have been identified. These are:
1. Sources, such as employees, vendors, outsiders, or natural forces.
2. Motives, such as personal/business gain, irrational behavior, or incompetence.
3. Acts, such as overt, covert, physical, single/multiple event, or real-time.
4. Results, such as disclosure, modification, destruction, or use of services.
5. Losses, such as monetary, health/life, denial of access/use, or privacy.
On the other hand, Campbell [41] suggests there are only two elements to a threat, the threat agent and the penetration technique (like Parker's source and act). Fisher includes the concept of motive [80]. We initially disregarded the elements of motive, results, and losses, in an attempt to simplify things. We would take results into consideration later, but motive seemed inconsequential at this point, and losses seemed more important in a quantitative analysis, which we were not undertaking. Our view of threats would simply be sources and acts.

Categorizing the threat agent, or source, is somewhat straightforward (e.g., [160]), and we generated a reasonably complete list of sources, integrating our own ideas with those of [26, 160, 41]. We first categorized these by job positions or types of people (architects, programmers, users, soft/hardware service technician, facilities maintenance, etc.). An example of this is presented in Appendix D. Later, it became apparent that what people know is more important than their job function, since they may know much more than the job would indicate.

Enumerating and categorizing the threat acts, or techniques, was a very different matter. They began to look like vulnerabilities, and the number of possibilities seemed endless. There were many references, such as [136, 144, 134], which discuss specific attack methods, such as electromagnetic pickup (EMP), covert channels, spoofing, scavenging, and Trojan horses. We had no idea how to actually use this kind of information in the design process. Each of these types of attacks could be realized by any number of means, and although the target may be a localized asset, such as EMP of a particular user's CRT, the attack may exist over a period of both time and space, such as the Trojan horse attack discussed before. Further, the target itself may not even be localized. In the EMP of the CRT example, the actual target is the data displayed on the CRT, not the CRT itself. These data could be any number of "information" assets. As an example of a completely nonlocalized attack, if a login spoof program is run by a system administrator, the attacker may now have full system privileges to attack virtually anything.

We could not see how to design a system in a systematic and predictable manner from such vague and complex interacting information. However, all indications in the risk management literature were to proceed in this direction. Thus, we integrated the threat techniques with our own ideas, and through reorganization, generalization, and some redefinition, produced the somewhat comprehensive threat technique list presented in Appendix E. Regardless of its value in the actual design method, it provides a useful frame of reference for how systems are attacked, and is valuable to review to keep a broad perspective on the issues.

6.2.3 Vulnerability Lists

The vulnerability list was the most difficult to deal with. Theoretically there are an infinite number. A first pass at simplifying them looked more like threat results, such as modification or disclosure of data, or denial of service. The existing literature and checklists were not much help here. For example, in [149], vulnerabilities are categorized under 17 forms of attack results, with the vulnerabilities discussed only in very high-level, general terms. At this point, some assets even began to look like vulnerabilities. For example, a communications port is clearly an asset, since it is a way to enter the node which must be protected. It is thus also a vulnerability, precisely because it is a way to enter the node. We initially put the vulnerability issue on hold, to see what could be done with the current assets and threats lists alone.

6.3 The Information Explosion

The next step in the risk management process was to apply each threat in the threats list to each asset in the assets list. Considering M assets, N threats (and P ways of realizing those threats), the M x N x P product grows rapidly. Even if software tools could handle the bookkeeping aspects, there were some major issues to resolve. We were looking for a way to methodically design safeguards from this information. However, considering the vague threat descriptions (e.g., Trojan horse, EMP), the fact that many could be realized in so many different ways, and the detailed assets lists which were out of context of the original application, there was little to guide this process. We felt overwhelmed by vast amounts of unstructured information. For example, consider the threat of EMP from a CRT. Any of the displayed information is vulnerable. However, how do we know what this information is? We have no idea whether it needs to be secured or not. There was no mapping that tied the isolated assets back into the application from which they came. Further, even if we could design the safeguards, it was not clear how to record the designs so they would be meaningful at a later time. Some concrete examples in the following sections will explicate these issues.

We attempted to carry out the process of applying the threats to the assets by starting with a separate file for each asset/threat pair. In each file we started to list the various ways in which that threat could occur against that asset. Each of these was essentially a vulnerability. We could then design safeguards to address these. Although this may sound straightforward, as it certainly did in the risk management literature, it is very difficult for at least the following reasons.

Consider the initial act of creating a file for each asset/threat pair. We must distinguish, for example, among different programs (assets) since each may have different security requirements. This implies a separate asset/threat file for each program. In addition, we will need a separate asset/threat file for each form and location in which each program can exist (e.g., pages on a network, pages in local memory, pages in remote memory, disk file on remote machine, etc.), since the threats against a program in memory on one node may be different from threats against its pages on the network, or threats against those pages on a remote node. Thus, an application with 10 programs, each of which can have 6 forms and locations (conservative), would initially have 60 program assets to consider. Each of these would then be paired with each possible threat against it to create the initial set of assets/threats files.

However, consider specifically the asset/threat files pertaining to a program as pages on a remote node. Many programs may have pages on the same node, each requiring their own set of asset/threat files for that node. In addition, this node itself will also have its own set of asset/threat files. Each of these may be slightly different. Some threats against the node may certainly be threats against all program pages that reside on it, but individual threats against one program may be different from threats against another, and some threats against the node may have no impact on the program pages residing on it.

As one attempts to fill in each of these files with the possible vulnerabilities and safeguards that correspond to that asset/threat pair, there is a great deal of overlap. We either needed a way to link the contents of these files, or we had to manually replicate many vulnerabilities and safeguards across multiple files. Neither of these approaches was easy, since vulnerabilities and safeguards are not isolated entities that can be simply listed in one place. As we have seen, these can span a range of space and time. For example, visibility of messages on a network may be due to uncontrolled access to certain nodes on that network, or to nonsecure software on other nodes, a user authentication system may involve multiple nodes and the network, and a local file access control mechanism may need to communicate with a remote authentication server. As a result, we found ourselves entering verbose text descriptions of vulnerabilities and corresponding safeguards directly into the assets/threats files, and manually replicating many of these across multiple files, as best we could.

Thus, isolating assets, threats, vulnerabilities, and safeguards into these individual files proved to be very difficult, even though much of the risk management literature tends to promote this very approach. Although this was marginally manageable for a small number of entities, attempting to do so for the whole system became very confusing. It was virtually impossible to see how it all interrelates. For example, there is no way to know if what one does in one place has any effect in another, or if a vulnerability in one place is identical to one addressed previously somewhere else. Although it would be possible to use a database system to organize this information, that would not provide the missing information on how these entities and text descriptions interacted, nor would it help in tying the detailed assets back to the original system from which they came. The examples given in the next section, where we discuss the problems of designing safeguards, should also help clarify the above issues.
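A small sketch (using the hypothetical counts from above) makes the bookkeeping explosion concrete: one "file" per {asset instance, threat} pair, where each asset instance is a program in a particular form and location.

    from itertools import product

    programs   = [f"program-{i}" for i in range(10)]            # 10 application programs
    forms_locs = ["pages on network", "pages in local memory",  # 6 forms/locations each
                  "pages in remote memory", "disk file local",
                  "disk file remote", "backup tape"]
    threats    = ["Trojan horse", "EM pickup", "spoofing",
                  "scavenging", "eavesdropping"]                # a very short threat list

    pairs = list(product(programs, forms_locs, threats))
    print(len(pairs))   # 10 x 6 x 5 = 300 asset/threat files before a single
                        # vulnerability or safeguard has been written down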

6.4 Safeguard Design

As we have seen, one usually can not discuss the effects of a threat or a safeguard in isolation. They usually have dependencies on other assets, threats, and safeguards. If nothing else, one may see that any one of a number of safeguards can solve a particular problem, but some of them depend on what is done elsewhere in the system. The following examples demonstrate these issues.

6.4.1 Trojan Horse Example

Assume we wish to address the problem of a Trojan horse program on a remote node (i.e., one under the remote user's control). More generally, how do we deal with the possibility of unauthorized modification of software on a remote node? In terms of risk management, the asset is a program, the threat is a Trojan horse, and the vulnerabilities are essentially any way one can achieve the result of unauthorized modification of the program. Following is a sample list of vulnerabilities associated with the possibility of unauthorized modification of software. It can be modified:

- During development.
- During storage before installation.
- During software installation on the node.
- By network access.
- By user of the node itself.
- By subverted software introduced into the node at some other time.
- By someone changing the disk on the node.
- By someone booting the node off of a subverted node.
- Any number of other ways we will not consider here.

Following are three (of many) possible scenarios for safeguard design that deal with this issue, reduced to very simplified descriptions. Each has different degrees of security against different forms of attack. These examples demonstrate two general issues. First, a safeguard against a particular threat may not simply be one thing in one place and time. Second, although the risk management approach suggests that each vulnerability be addressed independently, that is often very difficult because the safeguards become so interdependent. The full implication of these examples is discussed in the next section.

1. Safeguard-1
   - Run a trusted operating system with secure hardware (e.g., a Gemini system [182, 181]) on the node.
   - Secure the software installation on that node (i.e., make sure the right software is installed initially).
   - RESULT: Protects against malicious attack by trusted user or outsider.

2. Safeguard-2
   - Disallow a local disk on the node, and physically ensure that it can not be added.
   - Employ a trusted node/network interface unit that allows communication only with secure backend nodes.
   - Force the node to boot diskless from a secure partner node.
   - Make sure no tools are available to the node user (from secure backend nodes) that could be used to modify programs or data in local memory.
   - RESULT: Protects against malicious attack by trusted user or outsider.

3. Safeguard-3
   - Allow a local disk on the node.
   - Secure the software installation on that node.
   - Employ a trusted node/network interface unit which allows communication only with secure backend nodes.
   - Trust the application user to not modify programs or data in local memory or disk.
   - RESULT: Protects against malicious attack by outsider only; trusted user can subvert by modifying/replacing programs in local memory or disk.

6.4.2 The Problem

The above safeguards are certainly guarding much more than simply the modification of a program. They are dealing with a wide range of threats such as network and operating system attacks to achieve this simple goal. Further, the individual elements can depend on each other.

How one comes up with this type of analysis from the very simple asset/threat pair of program/subvert-program was not at all clear. We developed these safeguard scenarios entirely from general experience and knowledge of the system, not from the risk management approach. However, we were looking for a way to develop them in a structured way from the requirements; in this example, the requirement was the need to protect against unauthorized modification of software.

More importantly, how could we even keep a record of these options? Trying to do this for each asset/threat combination creates a big growth problem. We were accumulating endless amounts of text, discussing all the possible ways of dealing with various issues, but with no way of methodically tying this all together. Further, we had no way of seeing if these discussions were even compatible. A safeguard in one place may obviate others, or may be inconsistent with others, but we could see no way of determining this in a structured manner. For example, if we initially assume a nonsecure network and an untrusted user node (i.e., it runs software we can not verify), with secure backend nodes, then we may wisely choose a user authentication system based on one-time passwords, so they can not be reused even if recorded by an attacker. This can be done with challenge/response systems such as the Security Dynamics SecurID one-time password generator. However, as the security design evolves, we may decide to secure the network for other reasons, and to trust the software at the remote node (as in the above example). With this environment, a simple password or pass-phrase user authentication system may be quite adequate, since we would now not expect passwords to be released. The one-time-password approach is now unnecessary. The safeguards that led to the secure network and trusted node have obviated the earlier safeguard. How would we see this during the design process?

It is equally important to make sure that changes do not break safeguards designed earlier. To demonstrate how this can occur, the previous example can simply be reversed. If we initially assumed a secure network and trusted the remote nodes, we may have specified a simple password system. If later we move to an untrusted network, or decide to not trust the remote node, how will we see that this has broken the user authentication system? Another example can be seen in the above Safeguard-2 against Trojan horse attack. What if we later decided to install compilers on the secure backend node, because we see that it can not be accessed in an unauthorized manner, and the system administration staff will want them? How will we see that this violates the requirement for remote nodes, that no such tools be available to the user of the remote node?

Also, as the design proceeds, these new safeguards are becoming part of an interdependent system. They may depend on one another in functional and timing ways. For example, an access control mechanism will probably depend on information from the user authentication mechanism. How is this shown? As before, all of these issues may be discussed in the text descriptions of the safeguards, but utilizing such unstructured information is difficult. For example, if we want to play what-if games, and later pull some safeguard, or replace it with another design, how do we see the effects this has on all the others? How do we know what did or did not depend on it?

Lastly, the new safeguards must be protected; they become additional assets. However, they are not in the assets list, and are not a part of the tedious asset/threat pairing that was undertaken previously. Furthermore, they are not simple localized entities; they are procedures and modules, possibly spread across the system, making them harder to add to the assets list.
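The following sketch shows the kind of bookkeeping that was missing: if each safeguard recorded the assumptions it relies on, a what-if query could at least report which safeguards break when an assumption is withdrawn. The safeguard and assumption names are illustrative only.

    safeguards = {
        # safeguard             : assumptions it relies on
        "simple passwords"      : {"secure network", "trusted remote node"},
        "one-time passwords"    : set(),                       # requires no such trust
        "file access control"   : {"remote authentication server reachable"},
        "diskless user node"    : {"trusted network interface unit"},
    }

    def broken_by(withdrawn_assumption):
        """Safeguards whose recorded assumptions no longer hold."""
        return [s for s, needs in safeguards.items() if withdrawn_assumption in needs]

    # What if we stop trusting the remote node?
    print(broken_by("trusted remote node"))   # ['simple passwords']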

Virtually none of the above issues were discussed in the risk management literature (nor in most of the literature in any way targeted at secure system design). The examples used were usually so simple and straightforward that these kinds of issues never occurred.

6.5 Simplifying Assets and Threats Lists

We looked for ways to further simplify the assets and threats lists. A number of approaches existed in the risk management literature (e.g., [81, 156, 160, 41, 104]), and analysis of these demonstrated some fundamental similarities.

First, assets were generally made hierarchical, starting with assets groups and moving toward the specific elements. This was essentially what we had done initially, before moving to the relational/inheritance approach (Section 6.1.3) to handle the replication and redundancy problems. Perhaps a combination of both could be employed.

Second, instead of dealing with specific threats, many approaches would deal only with the results of threats (e.g., release of information). Then, as required, these could be expanded into their root causes, or perils [156]. This two-fold approach is shown in Figure 6.2. The complexity is still there, but the idea is to hide it until needed.

Third, throughout the literature, the possible threat results fell into seven general categories. These are the unauthorized occurrence of any of the following:

- Disclosure of information.
- Removal of information.
- Corruption of information.
- Destruction of information.
- Denial of service.
- Interruption of service.
- Use of services.

Although one may also include as threat results issues such as deception [138] or failure to act when one is supposed to [161], we believe these should be viewed only in terms of the target attack they cause. For example, a Trojan horse may deceive the user, but its goal is to steal the contents of a file or a password. The primary result of its action (why it was created) is unauthorized release of information, not the deception of a user.

For our purposes (and for most others), these can be reduced to only three forms of result, since the remaining ones can be viewed as special cases of these three: 1) unauthorized release of information, 2) unauthorized modification of information, and 3) denial of service. Destruction is just extreme modification, removal is release and/or destruction, and interruption of service is just a form of denial of service. We ignore unauthorized use of the system, since protection against release and modification of information makes such use rather worthless, unless one charges for services. Further, if it results in reduced services to others, then it comes under a denial of service attack anyhow. Lastly, we do not deal with denial of service attacks. This is a very complex issue (e.g., [3, 89]) and in many ways comes under the area of fault tolerant computing (e.g., [130]). Attempting to incorporate this into our policy would be far too big an undertaking at this time. Thus, we really only have two possible results to be concerned with:

- unauthorized release of information
- unauthorized modification of information

However, even after attempting to restructure assets into hierarchical groups, and discussing threats in terms of only these two results, the problem was still unmanageable. In order to deal with the design issues, these still had to be expanded, just as we saw before. We were simply accumulating notes and information on all manner of vulnerabilities and safeguard designs, without any methodical approach, and without any way to integrate the information. Our problem was not just the complexity of the lists, but the fact that we seemed to need that complexity in order to design anything useful. Hiding that complexity was not the solution, but we also did not see a way of managing that complexity.

6.6 Assessing the Problems at This Point

6.6.1 Overview

It was becoming clear that the straightforward risk management approach as discussed in much of the literature was not working well for our needs. The complexity was growing due to the information explosion of assets x threats x vulnerabilities x safeguards. Even if one could record all these details, and employ software tools to help manage the complexity, there did not seem to be any way to comprehend what was going on while trying to design a system around all of this. For example, how could we see the interdependency of safeguards, or deal with the difficulty of designing safeguards from the nebulous vulnerability descriptions? Further, it was difficult to see the big picture of the URSA system from within the detailed assets list, and thus we had no good idea of what we were really trying to protect (e.g., what data are displayed on the CRT, or what data are on a particular disk or network, and when?).

Some authors did indicate that the textbook approach to this problem was difficult to achieve for real situations without a large expenditure of resources. From our experience, applying more resources would not solve the problem. The problem was a lack of an appropriate methodology, structure, and tools. Although many authors suggested ways of simplifying the approach (some examples were discussed in the previous section, while more will be discussed later), these did not deal with the level of detail we needed.

However, many authors did not even acknowledge these problems, as though the solutions could simply be "cranked out" (e.g., [129]). Certainly, something can always be "cranked out," but the results will not be the type of systematic, predictable approach we were looking for. From our experience, the results at this point were looking more like a combination of preventive design plus find and fix approaches than any methodical design process.

6.6.2 Synopsis of the Issues

- How to deal with the information explosion of assets x threats x vulnerabilities x safeguards.
- How to create assets lists that reflected the dynamic nature of the system. The system entities exist in many forms, at different times and places, and with many interdependencies.
- How any of this could be modeled with the risk management approaches, especially while trying to keep things at a large grain (e.g., with the assets groups and threat results approach).
- How to deal with the analogous problems for threats and vulnerabilities in a tractable manner.
- How to develop safeguards for a particular vulnerability, when the vulnerability, the safeguards, and even the assets may all have a wide range of temporal and spatial dependencies.
- How to record information to determine the global effects of the simple addition or removal of a safeguard, such as with what-if scenarios.
- How to record the interrelation of safeguards, threats, and assets, in terms of functional dependencies.
- How to see the big picture of the system, from within the details of the assets and threats lists.
- How to structure the mass of information that results.

6.7 Need for Dynamic Information

Possibly the biggest difference between conventional risk management approaches and our needs was that they tended to deal in rather static environments with relatively sequential data flow patterns where each operation is handled by an isolated process. Our environment was inherently dynamic, distributed, and interdependent. In this section we discuss our numerous attempts to deal with this.


6.7.1 The Control Point Approach

One approach to adding dynamics to the analysis, and to reducing the information explosion problem, was proposed in [80], as the need to "chunk" the system into manageable pieces. This employed control point analysis, originally developed in the early 70's (discussed recently in [81]), where the control points cover the phases of information flow. An example set of control points and information flow is shown in Figure 6.3. For each control point one must determine threats and descriptions of how those can occur, as with any risk management approach. They did offer some means of trying to contain the information explosion (e.g., methods of listing and relating threats and safeguards). However, in doing so, one again ends up with a very high-level picture unsuitable for our needs.

It was also not easy to map the URSA system into the control points. There is a clean, sequential flow through the control points which is missing in the URSA system. In the end, it appeared that the single Data processing control point was actually the collection of our entire set of distributed system operations! It became clear that the control point approach was really limited to more classical data processing environments [81], especially where data are processed in one static location and where the security concerns are mostly outside of that processing operation.

We did try and create our own version of control points. Static elements could readily be assigned to control points, such as data on a node, data on the network, or the node itself. Dynamic operations, however, were different. For example, a generic query-request operation may move from one node, across the network, be processed on one or more other nodes (perhaps using another network), with results returned to the original node. Although this whole operation could be made a control point, it was a large chunk to deal with. The analysis of exposures and controls would cover multiple computer systems and networks, aside from the extreme overlap with other operations. We tried breaking the operations down into a set of more static parts that each could be more localized. For example, the query request operation could be broken down into query request on node, query request in transit, query request at destination, and so on. In trying to assess threats and vulnerabilities of these entities, we encountered problems similar to those already discussed. The different control points have interdependencies, as do the safeguards. They simply can not be analyzed independently. If one does begin to include the dependencies, it becomes very difficult to structure and understand.
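A minimal sketch of control point analysis as we understood it (the threat entries are illustrative): threats are listed per control point along the sequential chain of Figure 6.3. For URSA, essentially the entire distributed operation fell into the single Data processing point.

    control_points = [
        "Data gathering",
        "Data input movement",
        "Data conversion",
        "Data input communication",
        "Data receipt",
        "Data processing",
    ]

    threats_at = {point: [] for point in control_points}
    threats_at["Data input communication"].append("eavesdropping in transit")
    threats_at["Data processing"].append("every threat against the distributed URSA operation ends up here")

    for point in control_points:
        print(f"{point:28s} {threats_at[point]}")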

6.7.2 Data Flow Analysis

We attempted to employ data flow techniques to capture some of the system dynamics. Some risk management literature discussed flow diagrams [38, 206], but usually in terms of physical movement of data throughout an organization. For example, employee time cards are moved to payroll, where payroll checks are later distributed to the departments. Although similar issues are also important to us (e.g., software is developed off site, transported to the end-user, and stored locally, before being installed on the system), our immediate concern was how to model the more complex flows which occurred during the operation of the system. Further, the flow diagram approach in the risk management literature generally was used only to isolate the different areas of concern. Then, the conventional methods were applied to each of these areas statically, still working from assets and threats lists.

large assets groups => each asset group can expand to => item1, item2, ..., itemN
threat results      => each result can expand to      => peril1, peril2, ..., perilM

Figure 6.2. Model for Simplifying Assets and Threats Analysis

Data gathering
  `--> Data input movement
         `--> Data conversion
                `--> Data input communication
                       `--> Data receipt
                              `--> Data processing

Figure 6.3. Control Points

We generated a high-level flow diagram for the URSA system, initially showing the flow of programs and data into the system, as shown in Figure 6.4. On the left is the flow of application software, starting at the developer, being transported to the site, stored on site, and finally installed into the system. Similar flows are shown for database data, operating system software, and optional user software. The actual operation of the system is shown only at the highest level of abstraction, as an in use operation that both sources and sinks flows to/from the system (about as general as one can be). This is certainly not complete but was a starting point, if even for no other reason than drawing boundaries of our responsibility. For example, we could limit our concerns to the install or load phases and below.

We have shown the software and data as one way flows into the system. However, any of the install or load operations may employ software or data that are already present in the system. Thus, we should show an additional flow from the system back to the install operation. A more correct picture of the load database operation is shown in Figure 6.5. In fact, this is not correct either, since data can also flow to and from the person who is loading data or software. This is not shown. It became clear that each of these operations, and many others, could all be described in a single flow diagram that employed variables that could be bound to different operations, as shown in Figure 6.6. However, of what use was this? How would we relate this generalized picture to the endless assets and threats lists?

Furthermore, the system is actually a collection of nodes on a network, which immediately complicates the simple picture above. The operation does not just deal with input/output on one node, but can involve many nodes via the network, as shown in Figure 6.7. With this in mind, we tried to refine the flow diagrams into more detail. For example, consider the act of running a program on a node. We see that many flows are possible: program and data pages move across the network or to/from local disks; input/output can occur with the disk, keyboard, display, or network; pieces of program and data will reside in memory, disk, on the network, or in external storage; and so on. In general, all of these flows are possible. As a result, the flow diagram describing these possibilities became almost completely connected! As the detail increased it became clear that we were essentially drawing the architecture of the computer system on which the program was running! In retrospect, this makes sense. The most general flow diagram would be precisely that, since all legal flows within the system would be possible. This was not very useful for our needs.

67

[Figure 6.4: flows of application software, database raw data, OS software, and optional user software, each passing through development, transport, on-site storage, and an install or load-database step into THE SYSTEM, which is also shown IN USE.]

[Figure 6.7 shows an Operation exchanging External Inputs and Outputs with a Distributed System of nodes connected by a network.]

Figure 6.7. Operation Extended to Distributed Environment


6.7.3 Dependency Graphs

After more study, it appeared that the actual data flows may not be as important as the effect they had on the data. That is, perhaps some sort of dependency diagram for objects on a node would be more useful. The phrase "A depends on B" would simply imply that B can do something to affect A (e.g., A calls B and uses the results, or B can alter a state variable which A uses). In the example in Figure 6.8, upper entities would depend on lower ones. We hoped to just convert our assets into this format, and then apply our previous risk management approach. At each level we would list the results of an attack (release or modification of information), and then list the vulnerabilities that could lead to these, as demonstrated in Figure 6.9. This appeared more promising than our earlier attempts. We believed that more of the system structure could be seen, and that the vulnerabilities would be more obvious.

However, a difficulty emerged. We realized that an attack against a given level in the dependency diagram may not actually take place at that level. For example, modification of a program in memory does not take place by directly modifying the RAM. It comes from an access point, such as the keyboard, along with the aid of the operating system. Towards this end we re-did the dependency diagram. Previously we had listed vulnerabilities directly with each level. Now, with each level we instead list those levels at which a vulnerability can exist. For example, at the Program/Data level we might list the levels (Peripherals, Hardware, Network), since each of these levels can be involved in release of information from a program or its data. Then, for each 3-tuple of {level, threat result, level at which a vulnerability exists} we could list the possible attack scenarios. A simple example is shown in Figure 6.10.

This approach looked promising since it did automatically show some known, obvious vulnerabilities. For example, stopping RELEASE and MODIFICATION attacks against the Hardware and OS/Kernel does not make the application secure, since the Network and other Peripherals may still harbor an attack. This motivates the use of a server node with minimal peripherals, where the node and peripherals are physically secured (only accessible to trusted personnel). Then, all that would remain is to secure the network access.
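A sketch of this second form of dependency diagram (cf. Figure 6.10), in which attack scenarios are keyed by the 3-tuple {level, threat result, level at which the vulnerability exists}; the scenario entries are abbreviated examples, not a complete list.

    attack_via = {
        # level        : levels at which a vulnerability can exist
        "Program/Data" : ["Hardware", "Peripherals", "Network"],
        "OS/Kernel"    : ["Hardware", "Peripherals", "Network"],
        "Hardware"     : ["Hardware"],
        "Peripherals"  : ["Peripherals"],
        "Network"      : ["Network"],
    }

    scenarios = {
        ("Program/Data", "RELEASE",      "Hardware")   : ["EM radiation pickup",
                                                          "clandestine transmitter installed"],
        ("Program/Data", "RELEASE",      "Peripherals"): ["EM radiation pickup",
                                                          "look over user's shoulder at display"],
        ("Program/Data", "RELEASE",      "Network")    : ["eavesdrop on network"],
        ("OS/Kernel",    "MODIFICATION", "Hardware")   : ["replace boot ROM with clandestine one"],
    }

    def scenarios_for(level, result):
        """All recorded attack scenarios for one level/threat-result combination."""
        return {via: scenarios.get((level, result, via), [])
                for via in attack_via[level]}

    print(scenarios_for("Program/Data", "RELEASE"))
    # {'Hardware': [...], 'Peripherals': [...], 'Network': ['eavesdrop on network']}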

6.7.4 Problems with Dependency Graphs

It soon became clear that many real dependencies were simply not shown in our model. For example, data on disk (a peripheral) can be attacked via the OS/Kernel. This proceeds in the wrong direction; the lower entity is attacked from above. In fact, the attack from the OS/Kernel can actually be via the Keyboard (a peripheral), the network, or an application program running in memory (at the top in the diagram). None of this was visible in the diagram. By the time we tried to list all of these additional scenarios, this approach had become as complex as the previous assets, threats, and vulnerabilities lists, with all of the associated problems. The only benefit here was that we see a little more of the structure and dynamics of the system (which were completely missing in the previous approaches), but certainly not much more.


Application program (or application data)
Operating system + kernel
Node hardware
Peripherals (keyboard, crt, disk, etc.)
Network

Figure 6.8. Simple Dependency List Example

Item           Threat results   Possible vulnerabilities
Program/Data   R,M              Vulnerability 1, 2, ..., l
OS/Kernel      R,M              Vulnerability 1, 2, ..., m
Hardware       M                Vulnerability 1, 2, ..., n
Peripherals    M                Vulnerability 1, 2, ..., p
Network        R,M              Vulnerability 1, 2, ..., q

R = Release of information
M = Modification of information

Figure 6.9. Dependency Diagram with Threats and Vulnerabilities


Item           Threat results   Attack can occur via these levels
Program/Data   R,M              Hardware, Peripherals, Network
OS/Kernel      R,M              Hardware, Peripherals, Network
Hardware       M                Hardware
Peripherals    M                Peripherals
Network        R,M              Network

R = release of information
M = modification of information

How some of these attacks can occur (vulnerabilities), for each 3-tuple of
{level, threat result, level of vulnerability}:

{Program/Data, RELEASE, via Hardware}
    EM radiation pickup
    clandestine transmitter installed
{Program/Data, RELEASE, via Peripherals}
    EM radiation pickup
    look over user's shoulder at display
{Program/Data, RELEASE, via Network}
    eavesdrop on network
{Program/Data, MODIFICATION, via Hardware}
    replace encryption chip with clandestine one
    ...
{OS/Kernel, MODIFICATION, via Hardware}
    replace boot ROM with clandestine one
    ...

Figure 6.10. Full Dependency Diagram with Vulnerabilities

For example, even though we tried to include distributed effects in the dependency graphs, that analysis ignored the fact that the operations of the system were not just applied to a single node, but to a network of nodes. To address this, we tried to break operations down into phases such as user requests information and user receives information. Each had its own dependency diagrams with threat results, discussions of attack scenarios, and possible safeguards. We also tried to attach information to each safeguard, such as what it addresses and does not address, what it depends on, and what depends on it.

Certainly one could write down and discuss all of these issues, the possible threats and vulnerabilities, and the ramifications of different safeguards. As with our previous efforts, the problem was how to do this in a structured manner and to understand the interdependencies. This became as complex as the earlier efforts, with the same difficulties, while offering very little guidance in the design. In fact, it was probably more complex, since it was pointing out more of the detailed vulnerabilities in the system. The dependency diagrams offered little help in dealing with these issues.

In a futile attempt to simplify this, we tried to collapse the dependency diagram down into just vulnerabilities. For example, consider that nothing directly attacks a program, data, or the operating system. They are attacked via other means such as the keyboard or network. Thus, why even list the program, data, or operating system? We could just list those things that are attacked, being left with only hardware, peripherals, and the network. This is certainly much simpler. However, this was not a good idea since one completely loses track of what is being protected in the first place, of what the purpose of the attacks is! After all, an attacker enters commands at the keyboard in an attempt to break the program it is talking to, not to break the keyboard itself.

6.8 Need for Detailed Dynamic Analysis

In addition to the need to incorporate dynamic operations into the analysis, it was becoming apparent that even the fundamental assets might be better understood if approached not from the low-level view of static entities such as disk drives and node hardware, but rather from the high-level view of system operations. A small number of risk management approaches did suggest this direction, discussed at the end of this section. For example, to secure an automobile, it seems reasonable to start with high-level functions such as entering, starting, or driving the vehicle, rather than starting with low-level components such as the door locks, ignition coil, or steering linkage. It would seem to be more difficult to work backwards from the low-level components in trying to understand their security relevance than to work from the high-level functions downward.

We had, in fact, already considered such high-level functions from the beginning. The original assets list included services that the system provided. However, very little of the risk management literature dealt with this aspect in any detail. Most were quick to list and dissect the static, physical entities and attacks (e.g., a disk drive can be blown up, stolen, or replaced), but very little if any discussion was devoted to the analysis of services such as restore data from tape, list directory, or issue distributed query. We have already discussed some of our attempts to incorporate such operational, dynamic issues. Implicit in our earlier efforts was the desire to keep the dynamic view as high-level (and simple) as possible, such as program runs on node, data gets to node, and program returns results. It was believed that once we understood how to handle the high-level view we could move into the more detailed issues. If we could not deal with the handful of issues in the high-level view, there was little hope of dealing with a finer grain, more detailed situation. However, it began to appear that some of our difficulties might be due to the high-level view simply being too high-level (too vague) to deal with. For example, it might certainly be easier to analyze the threats against a single operating system command than against the entire operating system, even though the latter is a much higher level (simpler) description (there are hundreds of OS commands, but only one OS). Perhaps the whole attempt to keep the view simple was actually making the analysis more complex. The problem was how to move from this view downward.

We attempted to trace individual system operations. Starting with the query module we listed all user commands and then attempted to create a complete flow diagram for each resulting operation. This quickly became very complex since the system is a set of communicating state machines, and message routing and module functionality often depend on state maintained in different state machines. For example, the path taken by a user find operation may differ depending on user-settable parameters. In the general case, the request is sent to the index which simply sends the results back to the user. However, if a thesaurus is enabled, the initial request may be routed to a thesaurus module which then forwards it to the index. If the automatic search feature is enabled and if the number of index hits is below the threshold set by the user, the index will forward the results directly to the search machine, which then returns the document list to the user. In both of these cases, the different flow paths are caused by actions that may have occurred at a much earlier time. The number of possibilities seemed to grow boundlessly. The need to support dynamic reconfiguration of the application also complicated this effort.

Trying to manually list these flows was extremely difficult. Even if one could draw the data flow graphs for each possible operation, and then address each point in the data flow diagram by listing threats and exposures and designing safeguards, we would need a way to tie all of this together in order to analyze it and to deal with all of the interdependency. The information explosion issues were still present, only more so.

A small number of the risk management articles did advocate a top-down approach from the high-level functional view. Although most of these were too high-level to be of use, the transaction flow approach in [165, 194] appeared to be closely aligned with our needs. However, they relied on manually enumerating and tracing the flow of all high-level operations through the system, and manually enumerating the exposures (vulnerabilities) during the design process. In the end, this suffers from the same limitations we have already discussed.
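A much-simplified sketch of the routing problem just described (the module names follow the text; the parameters are illustrative): the path taken by a single find request depends on state set elsewhere, possibly long before the request is issued.

    def find_route(thesaurus_enabled, auto_search, index_hits, threshold):
        """Return the sequence of modules a find request passes through."""
        path = ["user interface"]
        if thesaurus_enabled:
            path.append("thesaurus")
        path.append("index")
        if auto_search and index_hits < threshold:
            path.append("search machine")
        path.append("user interface")
        return path

    # Two of the many possible paths for the *same* user command:
    print(find_route(False, False, index_hits=40, threshold=10))
    # ['user interface', 'index', 'user interface']
    print(find_route(True, True, index_hits=3, threshold=10))
    # ['user interface', 'thesaurus', 'index', 'search machine', 'user interface']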


6.9 Some Security Holes Provide Insight

During the efforts to develop a design methodology, we continued to analyze the security of the URSA system with brainstorming discussions and ad hoc analysis. The result was an increasing awareness of the types of problems we needed to address and of ways of addressing them. Three security issues in particular were instrumental in changing our point of view towards the design methodology. It appeared that nothing in our current design approaches would have offered any help in either discovering these issues or addressing them. These are now discussed.

6.9.1 Document List Access Control Filter

For a particular customer, we had previously built mechanisms into the URSA system that assigned each document a security class, and allowed each user to belong to any number of such classes. The security policy then stated that each user shall only see documents with classes contained in that user's class set. Consider the following user interaction with the system. The user issues a find command along with a query expression to the user interface, which sends this information to the index. After the index search has completed, it returns to the user interface two lists of document identifiers (DocIDs): a hit list for documents guaranteed to meet the query expression, and a maybe list for documents that require a full text search to determine if they meet the query expression. The user can instruct the user interface to retrieve or search any of these documents.

Our first step towards controlling document access was to add a simple filter function to the index, which was applied to the DocID lists resulting from the index search. It simply removed all DocIDs that were not in the user's class set before returning the lists to the user interface. The rest of the index worked the same as before. Since we would also need such controls at the search and retrieval modules (e.g., to stop the user from guessing DocIDs), we moved this function into separate modules that could each be interposed between the user interface and any of these backend servers (the details are irrelevant for this discussion).

With this filter in place, we wondered if a user could still infer information about documents outside of his class set, perhaps through timing issues or comparing list sizes. Towards this end we employed a simple information flow analysis on a high-level description of the index DocID filter operation. This analysis showed us that it was indeed possible for some information about the inaccessible part of the database to be inferred through timing considerations. By measuring the time for the index to finish a search and filter operation for different queries (e.g., one expected to be true for almost every document in the database, and one expected to be true for very few documents), and comparing these to the time to process specific queries, one can gain some insight as to the relative size of the database. For example, if a query for documents that contain my name returns no hits, yet takes substantially longer than a query that I expect would return no hits (e.g., search for a meaningless string of characters), I might be inclined to believe something is going on behind my back. Certainly, this may be of no concern, and the amount of information obtained may be insignificant. However, this did point out the value of information flow analysis, and made us wonder how this might be applied more during the original safeguard design process.
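To make the filter concrete, here is a minimal sketch of the kind of DocID filtering described above; the function and field names are illustrative assumptions rather than the actual URSA interfaces. The comments note why this naive arrangement still leaks timing information.

    # Illustrative sketch (not the actual URSA code) of the index DocID filter:
    # DocIDs whose security class is not in the requesting user's class set are
    # removed before the hit and maybe lists are returned to the user interface.

    def filter_doc_ids(doc_ids, doc_class, user_classes):
        """Keep only documents whose class is contained in the user's class set."""
        return [d for d in doc_ids if doc_class[d] in user_classes]

    def index_find(query, index, doc_class, user_classes):
        hits, maybes = index.search(query)      # search still runs over the whole database,
        hits = filter_doc_ids(hits, doc_class, user_classes)       # results are filtered afterwards,
        maybes = filter_doc_ids(maybes, doc_class, user_classes)   # so elapsed time still reflects
        return hits, maybes                     # documents the user is not allowed to see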

6.9.2 The Index Spell Command

In designing the above filter, we initially had overlooked the seldom used spell user interface command. The spell function allowed the user to search all words in the database for those that match a desired expression. No documents or DocIDs are returned, but simply a list of matching words. Clearly, one should not be able to obtain words from documents they do not have access to. However, our filter could not handle this issue since we had no separate database of words to which classes could be assigned. Although there were numerous ways of addressing this, the key issue was how to have avoided overlooking this in the first place. How could we have seen this initially when deciding on the design of the original filter? In fact, what criteria did we use in designing the filter in the first place? These are the sort of issues we wanted the design tools to help with. Again, it seemed as though some form of data flow analysis would be helpful here, since we would be trying to stop a flow from certain documents, through their words, to the index, and back to the user.

6.9.3 Auto Search and Retrieval Functions

The third situation came from analyzing the data flow for a simple query operation, and considering the auto-search and auto-retrieve features. With auto-search enabled, if the number of DocIDs on the maybe list generated by the index is below a user-settable threshold, those documents will be automatically searched, with the results then returned to the user interface. Otherwise the hit and maybe lists are simply returned to the user interface as usual. Likewise, with auto-retrieve enabled, if the number of hits from the search machine is below a user-settable threshold, those documents will be automatically retrieved. Otherwise the hit list is returned to the user interface as usual. These functions can be implemented in different places. For example, the auto-search decision and threshold check can take place at the user interface, within the index access control filter module that sits between the index and user interface, or within the index itself. Depending on how these are done, data flow paths can be very different depending on results of queries, searches, and threshold settings. Could the user perhaps infer information by observing where responses come from? Although it was not apparent that any of these would lead to information leakage, it was disturbing that we had no good way of finding out. Further, we had no tools to help in understanding what the issues were before we even began designing the security mechanisms. This appeared to be another candidate for data flow analysis.

6.10 A Note on Security Policy

We have not mentioned one important issue that is often completely overlooked in the risk management literature: the security policy. Asking if a system is "secure" is meaningless. The real question is whether it is protected against events believed to be harmful [38]. A given system is only "secure" with respect to its enforcement of some policy [67]. This was discussed in Section 4.5.2 with additional information provided in Appendix B. In much of the risk management literature, security is discussed in terms of unauthorized release or modification of information, but the definition of "authorized" is almost never discussed. This gives the impression that this either never needs to be further defined, or the definition can be divorced from the risk management design process.

If the policy were simply the mechanism by which "authorized" was defined then one should be able to replace it as required. That is, when asking if a particular access to some object is authorized, we are simply asking if the policy is upheld. Since the meaning of authorized will depend on the end-user needs, divorcing this from the design process sounds desirable. One could design a generic system, simply plugging in different policies as required. Since we were looking for a design approach that worked for a broad class of systems and supported a wide range of policies, this was desirable. We hoped to be able to determine assets, threats, and vulnerabilities, before ever discussing the specifics of the policy, placing the policy definition phase as late in the design as possible. At that time various representative policies could be introduced to see the effect on the required safeguards. Nothing in the risk management literature would lead one to believe this is not possible.

However, policy can not be completely divorced from the mechanisms used to enforce it, even though one of the fundamental security design principles (Appendix F) indicates one should strive for this separation. The policy requirements can have a large effect, both on the types of threats encountered and on the design of security safeguards. For example, imagine the difficulty of enforcing a true access control list policy if the underlying security mechanism was built only to support simple Unix file modes (based on only user, group, and world access). Even the fundamental task of determining assets can not be divorced from the security policy, even though the risk management literature almost universally began with assets lists created in full isolation of the policy. After all, whether an entity is truly an asset depends on whether it has any security relevance, which must depend on the policy since this is what defines our meaning of security. This is but one more issue that misaligns conventional risk management techniques with our needs. Our specific policy issues are discussed in Section 7.1.3.

6.11 Assessing the Problems at This Point

6.11.1 Overview

Although we did not state it as such at the onset, we essentially wanted to understand which elements of the system were vulnerable to which threats, at any point in time during their existence. In retrospect, this has to be difficult to do with only static assets descriptions (our initial attempts) since they simply do not model the dynamics of the system. Attempting to add the dynamics to the static lists did not appear to be the right approach. We then moved to large-grain flow diagrams. Although these captured some of the system dynamics, they were lacking necessary detail in just which assets existed where, and when. The framework appeared to be too high-level to be of use. It became very difficult to tie these flow diagrams back into the static assets and threats lists. Attempting to increase the detail in these flow diagrams resulted essentially in generic architecture descriptions of the underlying support systems (e.g., OS, node hardware, network), which was of little value. Our attempts at dependency graphs were equally unsuccessful. It was difficult to describe all the dependencies, and to incorporate the distributed characteristics. As with our other attempts, there seemed to be no good way to structure this information.

It appeared that the concept of approaching assets from the bottom up (from static, physical entities up to dynamic system operation) was perhaps less desirable than approaching them from the top down. In the top-down approach the initial assets of the system would be the functions it is to perform. These are what we are really trying to secure. We would then need a way to move from these down to the physical assets. Towards this end we tried to produce fine-grain flow diagrams of individual system operations. Given the nature of the system, these quickly became very complex. It was also very difficult to tie these back to the more static assets, which we still needed to do (e.g., the threats against a node, a static asset, must impact all entities inside of it, or that flow through it). We started to believe that one might need to analyze the static issues (e.g., program or data on a node) separately from the dynamic issues (e.g., system operations), and then find a way to tie these back together. It appeared that the top-down approach, and generating fine-grain flow diagrams, were both in the correct direction, if the complexity could be managed. The previous approaches were also complex, but appeared to lack the necessary information even if they could have been managed.

6.11.2 Synopsis of the Issues

- The dynamics of the system must be brought into the analysis.
- It is difficult to add dynamics to static assets and threats lists.
- Top-level flow diagrams are important to show overall dynamic patterns of the system.
- Flow diagrams can not increase in detail beyond a certain point, or they tend to become a general architecture description of the underlying support systems.
- Dependency diagrams did simplify some of the analysis (e.g., helping to see some interrelation of elements), but were very limited.
- Reducing threats and vulnerabilities to their results greatly simplifies the analysis. However, we somehow need to incorporate the actual threats and vulnerabilities at some point.
- Dependency diagrams were not well defined. They really did not show the dependencies we wanted, and did not help us see the distributed nature of system operations. They seemed to describe a static environment (e.g., the node) but did not help with the dynamics of the system.
- Dependency diagrams provide no indication of what objects one is trying to protect (e.g., at a node, there is no indication of what resides on the disk or in the memory).
- Perhaps dependency diagrams should be tied in with the higher-level flow diagrams.
- The concept of approaching assets from the bottom up (from static, physical entities up to dynamic system operation) appears to be less than ideal.
- A top-down approach to assets may be the correct approach, where the basic assets of the system are the functions it is to perform. These are what we are really trying to secure.
- If we generate fine-grain flow diagrams from a top-down approach, we need a way to move from these down to the physical assets.
- It may be useful to analyze static and dynamic aspects separately, then develop a way to tie these together.

6.12 Summary

This is a summary of our attempts to employ risk management techniques. What was learned from these efforts formed the foundation of our proposed design methodology. Many books refer to numerous checklists to aid in determining assets, threats, vulnerabilities, and safeguards, which can be quite detailed and specific. It is instructive to investigate these, and build one's own for a given application, since they then serve as a baseline. One can always return to them to see if something was missed, and can add to them over time. However, attempting to directly apply these to the risk management approach results in two major problems.

The first is the assets × threats × vulnerabilities × safeguards information explosion. This product grows rapidly. Much of the literature on risk management is an attempt to manage this information explosion. Certainly, this could be managed by appropriate software tools, but the real issue is trying to make sense out of it. This is the second problem. Once the system is decomposed into the detailed assets lists, it is difficult to understand how any of those assets relates back to the original functionality and dynamic nature of the system. This is lost in the decomposition. Given only the information detailed in Appendix C, one would be hard pressed to determine the purpose of the system it represents.

In addition, it is difficult to analyze and understand the threats and vulnerabilities since they are generally not the static, localized entities that the risk management literature generally alludes to (e.g., fire destroys data stored in warehouse, or disk with data is stolen). Most of the interesting situations from the perspective of a distributed software system are very unlike this (e.g., circumventing network encryption hardware at one node may allow release of information to a remote node, or planting clandestine software at time T may cause a security breach at a much later time). Further, many assets are simply not static; they can change both form and location rapidly. For example, viewing the threats against a static data file on a particular disk is one thing. However, when this file can quickly be read into memory, become a network message, and end up in memory on a remote workstation, the threats can vary widely. Of course, in each such isolated case, one can analyze the threats at each point. The problem is that the whole system is made up of many such assets, and analyzing them all in this manner leads to a very complex situation.

For the same reason, it is very difficult to understand the effects of safeguards. Whereas the risk management literature again generally discusses these in static, independent terms (e.g., sprinkler system prevents fire at warehouse, or placing disk in safe prevents theft), they can become very dynamic, distributed, and interdependent in distributed software systems. A number of separate safeguards may be required to address a given vulnerability, and any one safeguard may be helping to prevent any number of vulnerabilities. It is a many-to-many mapping. For example, a remote authentication server, a secure network, and a secure kernel on user nodes may all be required for correct user authentication. Correct user authentication coupled with appropriate software on backend servers and correct setting of file ACLs may be required to control document access. Correct user authentication may also be used by any number of other application functions (e.g., secure mail to/from other sites). It was extremely difficult to keep track of these interdependencies.

Possibly the most fundamental difference between conventional risk management approaches and our needs is one of static versus dynamic environments. Their view is generally of rather static environments with relatively sequential data flow patterns, where each operation is relatively independent, while our target environment is inherently dynamic, distributed, and interdependent. Those that did attempt to incorporate dynamic aspects generally did so only at a very high-level view, or where the dynamics resulted in a very simple (if not purely sequential) flow through the system. In all cases, little to no attention was given to the problem of dealing with the distributed, interdependent assets, threats, vulnerabilities, and safeguards.

We made numerous attempts to integrate dynamic aspects into the risk management approach. Although none of these were successful in achieving our goals, we learned from each attempt. One result was the realization that while we desired to keep the design process at as high a level as possible, in some respects this can make things more difficult. For example, it would be easier to assess the vulnerabilities of a single OS command than to assess the set of all OS commands. We also realized that approaching the problem from the low-level assets, and then trying to map these back into the application (in order to understand what they mean), may be the wrong direction. Perhaps starting with the application functions and then moving downward to the involved elements was a better approach.
As an analogy, to secure an automobile, it seemed more reasonable to start with high-level functions such as entering, starting, or driving the vehicle, rather than starting with low-level components such as the door locks, ignition coil, or steering linkage. It would seem to be more difficult to work backwards from the low-level components in trying to understand their security relevance than to work from the high-level functions downward. This would imply an analysis of each application function, and since one of our critical problems was understanding the relationship of all the pieces in the distributed system, a flow-based analysis seemed appropriate. Also, as we continued to investigate security issues with the URSA system, and analyze some of the safeguards that had been designed in ad-hoc fashion (e.g., preventive design or find-and-fix), it appeared that high-level information flow analysis techniques worked well in seeing the holes in these designs. However, our attempts to manually create such flow analyses proved futile. It appeared that, perhaps, integrating some form of automatic information flow analysis of the application functions, with what we had learned so far about the strengths and limitations of the risk management paradigm, was a direction worth pursuing. This has formed the foundation of our security design methodology.

CHAPTER 7

A SECURITY DESIGN METHODOLOGY

This and the next two chapters describe the proposed design methodology. The methodology provides for the previously stated security design goals and alleviates many critical limitations of other approaches to this problem. The framework of the methodology provides the same fundamental structured approach as risk management. However, the methods and tools to achieve this goal are very different.

The methodology consists of modeling all elements of the application and its operating environment in a high-level, object-based manner. A high-level information flow analysis is then applied to this model through a simulation-based approach. This will dynamically define the assets and indicate where policy violations occur. From this information, safeguards can be designed to address the violations. These are also modeled as objects, but at three different levels of abstraction. This allows us to readily separate the security architecture from the application and its support environment, to assess the interrelationship of safeguards, and to play what-if games with all aspects of the design. The design process is iterative and evolutionary. However, although the application modeling is specific to the target application, much of the support environment modeling can be done generically, tailored to specific developers' needs by simply setting variables. Because of this, small application developers can make use of environmental models which can evolve over long periods of time. These models can represent the collective security expertise of all those who have worked on them in the past, essentially becoming security knowledge bases.

The reader desiring an overview of the design methodology may wish to read the summary presented in Chapter 10 before continuing on. This may provide more of a framework for the following discussions, if one accepts that much of it will be out of context. This chapter introduces the flow analysis approach, and describes the modeling of the necessary aspects of the application and its environment. Chapter 8 discusses the problems with realizing the necessary flow analysis in practice, and the details of the chosen simulation-based approach. Chapter 9 describes the method by which safeguards are designed, modeled, and included in the analysis.


7.1 Background

7.1.1 The Focus of These Chapters

This research has dealt with a wide range of issues, clearly defining a complex problem which has been largely ignored, determining what appears to be required to solve this problem, and proposing a well-structured design methodology to achieve the solution. In these chapters a design methodology will be described, not the mechanisms by which that methodology should be implemented. We describe how the problem is approached, what the designer must be capable of doing, and the specific requirements of the tools that are necessary to help carry this out. Throughout this research, we avoided focusing up front on any particular design paradigms or tool sets, and instead focused on what we needed to be able to do. The constraints and requirements of a particular tool or paradigm can strongly influence the way in which a problem is approached, perhaps coloring the solution to reflect what the tools best provide rather than what one wants to achieve (e.g., when all one has is a hammer, everything starts to look like a nail). Our one focus was the structured approach of the risk management paradigm, the reasons for which have already been discussed and are emphasized in the next section. Of course, we are careful to ensure that the required mechanisms are realizable in practice. For example, Chapter 8 is devoted to understanding how the required information flow analysis can be carried out in a tractable manner.

Throughout these discussions, numerous examples are employed to solidify the concepts. In general, the specific approach taken to describe each aspect of the methodology was chosen because it appeared to do so in the most understandable manner. This does not imply that any of these approaches would necessarily be the best choice for implementation. For example, in discussing the need to specify the behavior of an object, axiomatic or operational specification techniques may be utilized, whichever is more appropriate for the particular circumstances. However, even though such common formal notations may be employed, this does not imply that a completely different approach would not be better in practice, such as graphical or illustrative [133]. Further, some issues are best presented using a simple declarative programming language, while others require only a very high-level qualitative analysis. Regardless of the chosen approach, numerous details will be omitted for clarity. For example, variable typing is dispensed with unless necessary, and in object-oriented descriptions the details of object instantiation and naming are often ignored. However, specification methods are only one part of the design methodology requirements. For example, database management tools will play a role in managing the design complexity, and a version control system will be required to help provide backtrack capabilities in the design process, as well as to provide accountability for design decisions. However, we have not attempted to specify which type of either one should be employed. Finally, the order in which the elements of the design methodology are introduced in these chapters was chosen to provide the best intuitive understanding of their purpose. It is not indicative of the order in which they are actually employed in the design process. In general, the design must be an iterative and evolutionary process in which all aspects will be involved.


7.1.2 Why a Risk Management Foundation?

Although this methodology has been designed entirely around the risk management paradigm, we are not implying that this is the only way to attack the security design problem. There are two fundamental reasons for basing the approach on risk management: 1) it is the only current approach that deals with the design problem in the desired well-structured manner, as discussed earlier, and 2) it has been used for thousands of years for this purpose. From the onset it appeared to be the correct logical way of approaching the security design problem. The difficulty was that it could not be applied to complex distributed software systems in a tractable manner, especially while achieving the design goals. As a result of extensive efforts to apply the risk management approaches in this environment, we were discovering specifically why they were failing, and also seeing ways in which they could be made to work. Thus, we chose to remain within this framework while searching for a methodology. It seemed reasonable to continue in this direction before discarding the paradigm and moving to an entirely new one. However, there may well be other approaches worth exploring. Some of these were discussed in previous sections, and some are discussed later as goals for future work. Following are two examples. First, the problems of designing safe and reliable systems have many similarities with the problems of designing secure systems. This was briefly discussed in Section 4.8. However, although attempts have been made at applying fault-tolerant system designs to the control of viruses and worms, there has been no attempt to apply such concepts to the entire secure system design process. This may be a worthwhile direction to pursue. In fact, fault trees, often associated with safe system design, are employed in our methodology. Second, while conventional software design methods fall short in the security design process, no effort was made to explore the possibility of modifying them to achieve our goals. As discussed in Section 4.9, although others have made efforts to integrate security requirements and designs into the overall software design process, there has been no attempt to achieve the goals we had set. This may be another worthwhile direction to explore. In fact, our methodology utilizes commonly accepted software development paradigms, such as object-based design and modular decomposition. Regardless of these other possible directions, exploring the oldest and most widely applied paradigm was a logical first choice. Although our approach to achieving the risk management goals will be quite different from the classical aspects discussed earlier, the bottom-line goals will be the same: determine assets, detect vulnerabilities, and design safeguards to address the vulnerabilities.

7.1.3 Security Policy

The need for a clearly defined security policy was discussed in Section 4.5.2, and the need to define such a policy early in the design process was discussed in Section 6.10. Until one can specify what is meant by unauthorized release or modification of information, it is difficult to even determine assets, not to mention assessing vulnerabilities or designing safeguards.

We employ a simple information flow-based policy, supporting both a secrecy and integrity policy. These can be viewed as very simplified flow versions of the Bell and LaPadula military security model (BLP) [67] and the Biba integrity model [87, 47, 28] respectively, which are briefly discussed in Appendix B. However, it is important to note that those models are based on state invariants (e.g., that the access control subject-object matrix enforce the policy in each system state), while ours is a true information flow model based on constraints on the actual information flow resulting from state transitions.

7.1.3.1 The Notation

Each security relevant (SECREL) object possesses a secrecy class (SC) and an integrity class (IC). We refer to a given object's SC and IC as object.SC and object.IC respectively, and to this collectively as the security relevance value (SRV) of the object. The term SECREL will be used to discuss general security relevance (e.g., Is SECREL information present?), whereas the term SRV will refer to the specific values of objects (e.g., The SRVs of objects A and B are identical). Each such class consists of a security level which can be partially ordered, and a possibly empty set of security categories. We represent this as class{category-set}, or simply as class if the category set is not important. The level and category-set part of any given class are referred to as class.level and class.cats respectively. For demonstration purposes we employ the conventional military secrecy levels of top secret (TS), secret (S), classified (C), and unclassified (UC), and the simple integrity levels of high (H), medium (M), low (L), and no-integrity (NI). For simplicity, we generally do not use categories in examples. The secrecy and integrity levels are ordered as:

TS > S > C > UC
H > M > L > NI

A class X is said to dominate a class Y (X DOM Y) if the following two conditions are met:

1. the level of X is greater than or equal to that of Y.
2. the category set of X contains all elements present in the category set of Y.

Thus, in mathematical terms, our DOM relation becomes:

X DOM Y ⟺ (X.level ≥ Y.level) ∧ (X.cats ⊇ Y.cats)

7.1.3.2 The Policy

Our secrecy policy is essentially the flow equivalent of the BLP "no read up" and "no write down" rules. This requires that information flow from object A to object B only if the SC of A is dominated by the SC of B. Logically, this says that information should not flow from a higher to a lower secrecy level, and that to receive information, the destination must belong to at least the same categories as the source.

The integrity policy is just the reverse. This requires that information flow from object A to object B only if the IC of A dominates the IC of B. Logically, this says that information should not flow from a lower to a higher integrity level, and that to receive information, the destination must not belong to any categories not also in the source. Thus, our entire flow policy can be concisely expressed as follows.

Flow A → B is valid iff: (B.SC DOM A.SC) ∧ (A.IC DOM B.IC)

A number of flow examples are given in Figure 7.1, with a description of how they either violate or uphold this policy.
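As a concrete restatement of the notation and policy above, the following sketch encodes the DOM relation and the combined secrecy/integrity flow check; the representation (levels as integers, categories as sets) and the object layout are illustrative choices, not part of the dissertation's own tooling.

    # Minimal sketch of the flow policy. A class is a (level, categories) pair;
    # dom(X, Y) is true when X dominates Y.

    SECRECY = {"UC": 0, "C": 1, "S": 2, "TS": 3}       # TS > S > C > UC
    INTEGRITY = {"NI": 0, "L": 1, "M": 2, "H": 3}      # H > M > L > NI

    def dom(x, y):
        """X DOM Y  <=>  X.level >= Y.level  and  X.cats is a superset of Y.cats."""
        x_level, x_cats = x
        y_level, y_cats = y
        return x_level >= y_level and x_cats >= y_cats  # set >= is the superset test

    def flow_is_valid(a, b):
        """Flow A -> B is valid iff (B.SC DOM A.SC) and (A.IC DOM B.IC)."""
        return dom(b["SC"], a["SC"]) and dom(a["IC"], b["IC"])

    # From Figure 7.1: S{a} -> TS{a} upholds the secrecy policy, while
    # S{a,b} -> TS{a} would not, because {a} is not a superset of {a,b}.
    a = {"SC": (SECRECY["S"], {"a"}), "IC": (INTEGRITY["H"], set())}
    b = {"SC": (SECRECY["TS"], {"a"}), "IC": (INTEGRITY["H"], set())}
    assert flow_is_valid(a, b)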

7.1.3.3 Why This Policy?

Note that this may be considered a minimal policy for common commercial needs, although it may be adequate for many military situations. For example, it does not address denial of service; it is only concerned with unauthorized release and modification of information. These issues are discussed in Appendix B. As also discussed there, policies tend to become more complex in specifying the different requirements encountered in commercial applications. In general, the particular policy requirements will be largely dependent on the particular type of system and how it will be used. Since the purpose here is to discuss the design methodology, incorporating a more complex policy would only serve to obscure that goal. As long as the policy goals can be expressed unambiguously in terms of constraints on flows, then this simple policy can be replaced with a more appropriate one as required. For example, in [83], information flow policies supporting a wide range of requirements are discussed. Regardless of the exact form of the policy, its role in the design methodology will be the same.

7.2 Determining Assets

This is the first step in the design process. In our earlier attempts to fully enumerate assets, every system object was viewed as an asset. Initially that was important in order to not miss anything. However, how can one decide which of those entities is or is not important in a security sense? The term asset implies that the entity needs to be protected, that it is security relevant in some way. One major problem is that we do not know what the real assets are when the design process is started. Although one may be able to point to obvious key assets such as database files and server programs, one needs a way to see the security relevance of every entity in the system. This includes all the messages exchanged among modules, the run-time environment of the modules themselves, any auxiliary files, and all physical equipment these rely on. Since the application was probably designed with no regard for security, such information would most likely not be recorded well. Even if such information were available, it must be in a form that can be used in the design process. In Chapter 6 we saw the problem with unwieldy static assets lists upon getting to the level of detail required for design purposes. Furthermore, our early attempts at employing physical assets lists were failures. We really needed to view the assets in terms of the dynamic system operations, to see a relationship between the static assets lists, and what was going on within the system.

Rather than starting from the physical environment (e.g., nodes, disks, networks), we found that it was much more effective to start from a high-level logical system description (e.g., the high-level design specification of the application before implementation). The motivation was the belief that if we could initially identify our fundamental assets (e.g., database files, users), then by tracing the flow of information throughout the application, we could determine all other elements of the system that either depended on these assets, or upon which these assets depended. These can be viewed as secondary assets. This is somewhat analogous to the dependency graph approach, discussed earlier in Section 6.7.3. For example, application database files can be considered fundamental assets, but the disk and node on which they reside are only assets as a side-effect. If no fundamental asset resided on (or depended on) the disk, then it would not be an asset either. Of course, if one did wish to consider the hardware alone as assets (e.g., because it costs real money or for denial of service reasons) then the policy would have to reflect this in some way. However, our policy is concerned only with release and modification of information that has been assigned an SRV (security relevance value). Thus, we will move from the logical system description to the physical implementation in determining assets. The way in which this is done is described in the following sections.

7.3 Information Flow Analysis (IFA)

In order to determine dynamic asset information, a high-level information flow analysis (IFA) is applied to abstract descriptions of the application modules. This allows one to see the flow of security relevant information into any system entity at any point in time. This is accomplished by first creating a local information flow analysis (LIFA) for each application module, each of which is viewed as a finite state machine (FSM). A global information flow analysis (GIFA) can then be constructed for this system of communicating modules, viewed simply as a set of communicating FSMs (CFSMs). Individuals will generate the LIFAs for each module, while a flow analysis tool will produce the GIFA.

Information flow analysis has been previously applied to the security verification problem [87, 58] and specifically in covert channel analysis [101]. However, our use of it here is to aid in the initial design process, by providing us with a very different type of information and guidance. The difference is equivalent to that between testing a design after the fact, and developing the design in the first place. We need the flow analysis to help in the development process. As we will see, employing flow analysis for this purpose results in increased complexity and demand on the required tools. However, we will also see a number of issues that balance this. First, this is to be a design aid, not a tool for precise formal evaluation as are many flow analysis approaches. Thus, it can be simplified by allowing a large amount of abstraction in the specification and minimizing the level of detail. This is also in contrast to many uses of information flow techniques which lean in the opposite direction, such as code-level flow analysis. We will see that such detailed analysis would be impractical for our goals. Further, the modeling is to evaluate the security aspects of the application, not to verify its correct operation (as would be the goal of many formal efforts). It is assumed that the application already works correctly. As we will see, by carefully choosing the purpose and requirements of our modeling enough abstraction and simplification can be allowed to keep this tractable. The LIFA and GIFA will be described with examples after a brief discussion of information flow concepts.
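To make the LIFA idea more tangible before the formal discussion, here is a minimal illustrative sketch (the notation below is an assumption, not the dissertation's own) of a module abstracted as a finite state machine whose transitions record which objects information flows from and to; a global analysis would then compose such machines and check each recorded flow against the policy.

    # Hypothetical LIFA-style abstraction: a module is a set of states plus
    # transitions; each transition names the message that triggers it and the
    # information flows (source object -> destination object) it causes.

    class ModuleModel:
        def __init__(self, name, initial_state):
            self.name = name
            self.state = initial_state
            self.transitions = {}   # (state, message) -> (next_state, [(src, dst), ...])

        def add_transition(self, state, message, next_state, flows):
            self.transitions[(state, message)] = (next_state, flows)

        def receive(self, message):
            """Fire a transition and return the information flows it generates."""
            self.state, flows = self.transitions[(self.state, message)]
            return flows

    # A toy index module: a "find" request causes flows from the query into the
    # index and from the index database back toward the user interface.
    index = ModuleModel("Index", "idle")
    index.add_transition("idle", "find", "searching",
                         [("UserIF.query", "Index"), ("index-db", "UserIF.results")])
    print(index.receive("find"))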

7.3.1 The Meaning of Information Flow

In this section the general concept of information flow is briefly introduced. A detailed discussion of these issues is presented in Chapter 8. Information flow is much more than just control flow among objects. For example, following a simple FIND operation from a user to the index and back, we see the control flow shown in Figure 7.2. However, although SECREL information certainly will flow from the index-db back to the user, it is doubtful that any SECREL information flows from the user query expression into the index-db. Information flow is dependent both on control flow and the interaction of the object variables (similar to conventional data flow). Classically, this can happen in a number of ways [58] such as:

- Explicitly, by variable assignment from expressions involving other variables.
- Implicitly, between variables involved in conditionals, and variables in the branches dependent on those conditionals. This can flow both from conditional variables to branch variables, and from branch variables to conditional variables. For example, consider this conditional:

      if (x == 1) then y = T else y = F

  After execution, if we know that y is T, we can infer that x was 1. Before execution, if we know x is 1, we can infer that y will be T.
- Between variables in a function call and formal parameters of the function. This can flow both into the function and back out.
- Between variables in a function and errors or unexpected conditions that are reported back to the caller.
- All of the above also applies at the machine level as well, involving registers, pages of memory, and the program counter.
- In all of the above flows, assessment of the amount of information flow is often possible. For example, consider the following conditional:


Secrecy Flows          Policy Upheld?   Reason
S{a} → TS{a}           Yes              (TS ≥ S) ∧ ({a} ⊇ {a})
S → C                  No               (C ≱ S)
C{a,b} → S{a,b,c}      Yes              (S ≥ C) ∧ ({a,b,c} ⊇ {a,b})
S{a,b} → TS{a}         No               (TS ≥ S) but ({a} ⊉ {a,b})

Integrity Flows        Policy Upheld?   Reason
H{a} → M{a}            Yes              (H ≥ M) ∧ ({a} ⊇ {a})
L → H                  No               (L ≱ H)
H{a,b} → L{b}          Yes              (H ≥ L) ∧ ({a,b} ⊇ {b})
M{a} → L{a,b}          No               (M ≥ L) but ({a} ⊉ {a,b})

Figure 7.1. Policy Flow Examples

[Figure 7.2. Control flow for a simple FIND operation: User1 → UserIF → Index → index-database, with results returned back toward the user.]

[Figure 7.9. Environment with Planned and Unplanned Operations]

user may be able to become superuser to the application. Or, by aborting the application find operation the possibility may exist that the document filter would be bypassed and unfiltered documents would be returned to the user. Such situations are continuously being discovered for many large operating systems and utilities. These aspects will not show up in the LIFAs simply because they are not part of the planned operation of the application. These can be discovered through design testing, through actual use of the application, or even through formal development efforts such as specification at the code level. Although we do not anticipate many of these, they must still be considered. Thus, just as with the environment, we also break down the application into its planned and unplanned operations, as shown in Figure 7.10.

7.5.4 The Complete Hierarchical Breakdown

The full application plus environment planned/unplanned distinction is shown in Figure 7.10. The only aspect we have addressed so far is the application planned operations, modeled with the LIFAs. The remaining aspects are discussed in the following sections. As an example of an entity that spans all four areas, consider an application user. Initially a user is modeled with a LIFA, since the user is expected to enter a finite set of commands and respond in a well structured manner, just as any other FSM in the application. When operating according to its LIFA, the user is engaged with the application planned operations. However, the user may not always respond as expected. For example, the user can maliciously attack the application programs, perhaps by issuing an unanticipated command sequence to cause the application to run incorrectly. In causing the application to run in an unplanned manner, the user is now engaged with the application unplanned operations. Outside of the application, the environment may include common utilities such as a text editor, mail program, and news program, which the user is allowed to access. A user employing any of these services, such as sending mail to a friend or editing a disk file, is now engaged with the environment planned operations. However, as with the application, the user may interact with the environment in unplanned ways. For example, the user may exploit an OS bug to become superuser, or replace system libraries to circumvent file access controls. Such operations are not part of the expected operation of the environment, and a user undertaking such actions would be engaged with the environment unplanned operations. Thus, a user can operate in any of the four views. Note, whether any of these operations causes a security violation will depend on the flow analysis. All we are showing is how a single entity can be a part of all four views.

7.5.5 Significance of this Breakdown

The motivation for breaking the problem down this way came as we got further into the development of the design methodology. The farther we moved away from the application modules towards all of the other issues that had to be considered, the more difficult the modeling aspects became. Each step away from the application resulted in the need for more abstraction, and also resulted in increasing complexity.

By creating the planned/unplanned distinction and applying it independently to the application and its environment, we have provided much more structure to the problem of discussing the seemingly endless vulnerabilities, which as we saw in Chapter 6 can be very difficult. Starting with the application planned operation gives a solid foundation that is easy to produce. The planned operations are defined entirely by the application code (and its mapping into pseudo-code). The environment planned operations, while not nearly so automatic to generate, are still reasonable to produce since we generally understand the high-level services that are expected from entities like an OS or filesystem. These are both in contrast to the unplanned issues, which encompass everything else that can possibly happen, and which are certainly not as easy to model. However, the better we can explain the planned operation of any given system, the less we have to say about its unplanned aspects. Each security problem covered in the planned operation is one less that must be addressed from the unplanned perspective, which is a much more difficult direction to work from. By the time we have to model the unplanned operations, we have eliminated as much extra baggage as possible via the planned operation modeling.

Note, the line between planned and unplanned operations in the environment will not be as clean as the one bounding the application planned operations. That one was easy to draw, since once the LIFAs were created the unplanned operations and the external environment were simply everything else. The line was automatically drawn by the definition of the application. The line between planned and unplanned operations as we get into more detail of the environment will be much more nebulous. However, this is not a Procrustean bed. This is a conceptual tool to be applied as needed, one more attempt to provide structure to an otherwise unwieldy problem. Some aspects of the environment may easily fall onto one side or the other of the planned/unplanned line, while other aspects may never feel correct. However, it is not the precise location where the line is drawn, but merely that we can draw it at all that helps in the process of describing the environment by allowing us to partition the problem into more manageable pieces as needed. Of course, in the end, the total of what is said should be the same as if we modeled everything in one step. It is the process of getting there that we are simplifying. This divide-and-conquer approach will aid both in building the models of the environment and in simplifying the flow analysis.

An additional benefit of this partitioning is that safeguard designs (Chapter 9) can be clearly limited to a subset of the total system. For example, in a nonhostile environment requiring a very low level of security assurance, one may decide to address only the application planned operations and not get involved in the more complex application unplanned and environmental issues. This may be sufficient to stop many forms of error and casual attacks and may be adequate for many applications requiring minimal security. For more security one could also include the application unplanned and the planned environment, while still leaving the unplanned environment unaddressed. Although it is desirable to initially specify all such issues before deciding which are not of concern, pragmatic limitations such as time or personnel may dictate that the line be drawn earlier.


7.5.6 The Remaining Sections

In the next section we describe the importance of viewing all of these aspects as pure information flows. We then move to the modeling of the environmental planned and unplanned operations (Sections 7.7 and 7.8), leaving the application unplanned operations for last (Section 7.9). We leave this for last because the environmental aspects will provide more than adequate context for understanding the design method, and are where most vulnerabilities of interest will be found. The application unplanned aspects will be seen as simple in comparison and as such, very little time is devoted to them.

7.6 Viewing the World as Flows

It is important to note that all of the operations that can occur throughout the world view only represent possible problems. Whether any of these is a real problem depends on whether they cause a policy violation, and this can only be determined from a flow analysis. Although the LIFA models of the application provided the information flows for the application planned operations, attempting to model the remaining operations with the same paradigm proved difficult. For example, what would the LIFA look like for the action of replacing a trusted boot ROM and rebooting a node, for forging a network message, or for a user logged in under duress? These were not as straightforward as mechanically mapping code to pseudo-code to FSMs. Our early efforts to model these aspects were hampered by aligning ourselves with approaches taken in conventional risk management. As discussed earlier in Chapter 6, one sees endless lists of possible vulnerabilities such as "disk is subverted," "boot ROM is replaced," or "network messages are forged," without any notion of how these actually impact the application. One seldom if ever sees the policy even mentioned in these listings. However, without understanding how these can or can not cause a policy violation they are of absolutely no use. For example, the only reason we care if a boot ROM is replaced, or a network message forged, is if these in some way cause a policy violation. Otherwise they are of no consequence in terms of security (remember, the policy is our definition of security as discussed in Sections 4.5.2 and 6.10, and Appendix B). Our solution to this problem is to insist that all aspects of the system be modeled purely as possible information flows. This includes both the planned and unplanned operations of both the application and the environment. Then we can apply the same information flow analysis across the board to discover policy violations. The following examples demonstrate how environmental operations can be mapped into information flows. The act of replacing a boot ROM is modeled as a flow of information from a person into the current boot ROM, since essentially the person is altering the boot program. The act of forging a network message is modeled as an information flow from some node or person into the network and attached nodes. A user logged in under duress is modeled as a normal user who is logged in plus an additional flow of information between that user and the person causing the duress, since this is precisely what is occurring.

Thus, we adopt the terminology of planned and unplanned information flows to represent the planned and unplanned operations. By insisting on this common view, everything in the system is reduced to the same fundamental flow concept. The same flow analysis applies everywhere and there is no ambiguity as to what does or does not cause a policy violation. Although conceptually simple, this was not a trivial task, especially considering the countless possible actions that can occur and the need for a tractable approach that could be realized in practice. However, this is a key element of our design method, providing a necessary solid framework that was otherwise missing. The details of how this modeling is carried out are discussed in the following sections.
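The examples above all reduce an environmental event to one or more source-to-destination flows. A small illustrative sketch of that mapping follows; the event names and record format are assumptions made for this example, not notation from the dissertation.

    # Hypothetical mapping of unplanned environmental events onto information
    # flows (source -> destination), so that the same policy check used for the
    # application's planned flows can be applied uniformly.

    UNPLANNED_EVENT_FLOWS = {
        "boot_rom_replaced":  [("Person", "BootROM")],           # person alters the boot program
        "network_msg_forged": [("Person", "Network"),
                               ("Network", "AttachedNodes")],     # forged data reaches attached nodes
        "user_under_duress":  [("User", "Coercer"),
                               ("Coercer", "User")],              # extra flows beside the normal login
    }

    def flows_for_event(event):
        return UNPLANNED_EVENT_FLOWS.get(event, [])

    for src, dst in flows_for_event("boot_rom_replaced"):
        print(f"flow: {src} -> {dst}")   # each flow is then checked against the policy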

7.7 Environmental Planned Operations

In this section we address the issue of how to model the environmental planned information flows. This is the expected operation of essentially everything that is not the application, such as the operating systems and nodes the application runs on, the physical rooms in which they reside, and the personnel involved.

7.7.1 The Key Modeling Concepts

There are fundamental differences between this task and the LIFA modeling, resulting in the need for modifications to our approach. In this section we discuss these differences and other issues involved, and then introduce the key concepts that are critical in successfully modeling these environmental information flows. Details on each concept and how it is employed in the modeling are presented in subsequent sections. Many of these concepts will also be utilized in modeling the unplanned operations as well.

Remember that the application LIFAs were directly derived from the already existing application by simply mapping the well defined operations of each module into the format of our model. For the environment we have no such preexisting design; it will have to be created. Further, the elements of the external environment are numerous and diverse, essentially consisting of everything that is not the pure application. This can essentially be viewed as the universe, whereas the application was only a small well-bounded subset. Without appropriate simplifications this is clearly a much more complex issue. It is also important to realize that these environmental elements are not necessarily monolithic objects residing at one location. The communication system, a network filesystem, or a distributed operating system will all exhibit distributed flow characteristics that we need to see, just as we needed to see these with the distributed application modules. For example, a network filesystem poses a potential vulnerability to objects on every node it runs on, since use of a file access function on one node can result in a flow to/from a file on any other node in the group. We must be able to model such flows. Further, we must not only be able to see these possible flow problems, but we must be able to see exactly how these affect specific objects within the application. For example:

- If a particular network link is vulnerable to tapping, which messages to/from which objects does this affect?
- If a particular node is not secure against software tampering, which application modules or data will this affect?
- How are data files on a particular disk drive vulnerable, and if they are, what application data does this represent?



107 7.7.8).  How to extend our method of modeling software to include the nonsoftware physical environment, such as node hardware, physical networks and rooms, and personnel (Section 7.7.9).  How this more complex modeling task is o set by the reusability of the resulting objects (Section 7.7.11).

7.7.2 A Layered, Object-based Approach

We will model all aspects of the environment, including software, hardware, personnel, and physical rooms, using a layered, object-based approach in which simple objects are combined as required to create more complex environments. Our goal is to model these elements as independently as possible, allowing us to create simple generic pieces that are bound together as needed to create the necessary target environments. For example, an operating system object may be layered on top of the objects representing both the communication system and node hardware. The application LIFA objects that run on this platform would then be bound with these. All of these objects would be bound with the objects representing the specific physical environment in which this node exists.

Some specific examples will explicate this approach. The application index module, in order to communicate with other application modules, may be modeled as issuing calls to generic communication routines called com.send() and com.recv(). This allows it to be developed independently of the underlying communication support software. Similarly, the com object that it is eventually bound with may handle name/address issues and message packaging, but then simply call on generic net.send() and net.recv() functions to actually deliver the message. This allows us to support different types of underlying networks, such as Ethernet or token ring, by simply binding the appropriate net object with the com object.

As another example, consider that two very different operating system models may exist. SecureOS() may be built on a two-state processor with memory management (e.g., representing a workstation) whereas NonSecureOS() has no internal protection (e.g., representing a PC). However, each of these may be defined to provide identical application-level services. Thus, we can model an application running on a secure or nonsecure system by simply binding with the appropriate operating system object.

The following three scenarios help demonstrate the need for this type of flexibility.

• An index may run on one type of operating system, with a particular type of node and room environment, whereas a user interface may run on a different operating system, on a different type of node in a different room environment.

• The type of network connecting the index node with the user interface nodes may be different from that which connects the other backend servers with each other and with the index.

• The physical environments of these different networks may also be quite different. The backend server network may be physically accessible only to cleared personnel, whereas the other lies on the floor in an open bullpen.

To specify the necessary bindings we will use a simple global binding table (GBTable), which also provides a clearly visible record of the system structure (critical both during and after the design). Please note that both the GBTable approach and our overly simplified syntax were chosen solely for clarity of examples; any implementation would certainly require more attention to detail and may well employ a very different model. A simple GBTable example is shown in Figure 7.11. Note, we defer discussion of how physical elements like nodes, rooms, and personnel are included in these bindings until Section 7.7.9. Following is a brief description of what this GBTable represents.

• The index (Index) has its os.* calls bound with a secure operating system (SecureOS), and its com.* calls bound with an interprocess communication package (std-ipc).

• The user interface (UserIF) has its os.* calls bound with a nonsecure operating system (NonSecureOS), and its com.* calls bound with another instance of the same std-ipc that the index uses.

• std-ipc has its net.* calls bound with a single Ethernet module which provides the necessary physical communication between Index and UserIF.



Breaking the environment down into these independent pieces helps make the modeling process more tractable in two ways. First, it allows us to develop more simplified generic pieces that are combined to build more complex environments. Second, it allows each element to be modeled at only the level of detail required, which can greatly reduce the complexity. Examples of this will be seen throughout later sections.

These objects will be defined in the same manner as the LIFA models of the application. The LIFA approach allows us to model any necessary distributed characteristics of these elements (e.g., networks, distributed operating system, or network filesystem). More importantly, this allows the flow analyzer to calculate all possible SECREL information flows among all system objects, regardless of whether they are the application or its environment, in a uniform way.
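To make the binding idea concrete, the following is a minimal executable sketch (in Python rather than the C-like shorthand used for our models); the class names SecureOS, NonSecureOS, and IndexModule, and the dictionary form of the GBTable, are inventions for this example only.

class SecureOS:
    """Stand-in for an OS model built on a two-state processor."""
    def open_file(self, name):
        return "protected handle for " + name

class NonSecureOS:
    """Stand-in for an OS model with no internal protection (e.g., a PC)."""
    def open_file(self, name):
        return "unprotected handle for " + name

class IndexModule:
    """Application module written only against the generic os.* interface."""
    def __init__(self, bindings):
        self.os = bindings["os"]          # resolved through the GBTable
    def load_index(self):
        return self.os.open_file("index.dat")

# The GBTable reduces to a record of which environment object each generic
# interface is bound to, mirroring Figure 7.11.
gbtable = {"Index":  {"os": SecureOS()},
           "UserIF": {"os": NonSecureOS()}}

index = IndexModule(gbtable["Index"])
print(index.load_index())                 # behavior depends only on the binding

Because both OS classes provide the identical open_file() service, rebinding the os entry models moving the index to a nonsecure node without touching the application model itself.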

7.7.3 Explicit vs. Implicit Service Providers

We have found that an important distinction must be made between two classes of planned environmental object:

• Those that provide an explicit service to the application objects.

• Those that do not. That is, those that provide only implicit services.




 --------------
 | World View |
 --------------
       ||
       \/
 ---------------         ----------------------------
 |             |  -----\ | Planned Operations (LIFA) |
 | Application |  -----/ |---------------------------|
 |             |         | Unplanned Operations      |
 |-------------|         |===========================|
 |             |  -----\ | Planned Operations        |
 | Environment |  -----/ |---------------------------|
 |             |         | Unplanned Operations      |
 ---------------         ----------------------------

Figure 7.10. Application and Environment Planned/Unplanned Distinction

GBTable:
    Index()
        os   BINDS-TO  SecureOS
        com  BINDS-TO  instance-of(std-ipc)
    UserIF()
        os   BINDS-TO  NonSecureOS
        com  BINDS-TO  instance-of(std-ipc)
    std-ipc()
        net  BINDS-TO  Ethernet

Figure 7.11. Global Binding Table

As an example of an object that provides an explicit service, consider that the application modules communicate via messages. We will need to model the way in which these messages are handled (e.g., the communication calls and the physical network which delivers them to remote destinations) in order to calculate flows among the objects as well as to allow the model of the application to work correctly. That is, we must know what can possibly go where, in order to generate the global flow analysis. Another example is a filesystem, in which the application LIFAs include the writing/reading of disk files and our environment object must provide this service.

Note that use of such services is not limited to the application. Other processes and users of the node can also use them. For example, the communication system and physical network may be used by any other process on the node to exchange information with any other reachable node. This must also be modeled since it is part of the planned operation of the environment.

The implicit service providers are other environmental objects that are included only to see their security impact, but that provide no direct service to the application LIFA models. An example of this may be the OS at a node. Certainly, the OS provides many services to the application at runtime, but none of these may be explicitly included in the LIFA. The only services the application may explicitly use are those of the communication and filesystems. Another example is the node hardware. There will likely be no reference to registers, busses, and replaceable disks in any of the application LIFAs or the OS description. These are just too detailed for our purposes in those models. However, these objects have a strong impact on security (e.g., the ability to replace a boot disk) and therefore must be included in our analysis. We will see later how they are included in a much higher level way.

Just as with the explicit service providers, use of such implicit services is not limited to the application. Other processes and users can make use of them. For example, a single-state OS may be used by any process to read or write memory pages of any other process, or to send arbitrary signals. As another example, nonsecure node hardware may allow an intruder to alter the process status register, replace a boot disk, or tap onto the internal bus.

Thus, we need to model two different types of service providers, and how they interact with both application objects and nonapplication objects:

• Modeling the explicit service providers and their interaction:
  - With application objects.
  - With nonapplication objects.

• Modeling the implicit service providers and their interaction:
  - With application objects.
  - With nonapplication objects.

In the next section we explain the modeling process for explicit service providers and their interaction with application objects. Following that we discuss the difficulties with the remaining issues, and introduce the novel concept of parasites which allows the modeling of the remaining issues in a tractable and unique manner.

7.7.4 Objects Providing Explicit Services to the Application

These services are modeled as FSMs using the LIFA approach, just as if they were additional application objects. The object will become part of the application GIFA. Without it, the GIFA simply can not be calculated (e.g., the communication system that links application objects must be present in order to calculate flows among those objects). As with the application, some objects may be monolithic and others distributed. The fundamental difference from the application LIFAs is that we must first design these before we can model them, whereas the application was already designed for us.

For example, consider the communication system to handle the application message passing. A very simple model might employ FIFO queues for each destination, with send() and recv() calls simply queuing and dequeuing messages. Such a model may consist of two parts, a communication (com) object and an underlying network (net) object, as shown in Figure 7.12. To send a message, the com object maps the destination name into an address (e.g., a {node, process} pair). If the node is the local node, it simply enqueues the message on the queue for that process. Otherwise it calls the underlying net.send(). The net object models message delivery in the same manner, with a queue for each {node, process} pair. Receiving a message involves simply dequeuing from the appropriate queue. Clearly this can get more complex, such as adding recv_any() or remote procedure call (RPC) functionality, but the concept should be clear. A shorthand C-like notation is used for further simplification.

As another example, consider a nondistributed filesystem model that simply provides information flows to and from the designated file object, without actually storing any other information in the files. This may be all we need to determine if flow violations occur, the actual data in the flows being irrelevant. Shown in Figure 7.13, note that the file object must be a state variable that persists across the function calls in order to accumulate the flows over time. The create() and zero() functions insure that this state begins empty.

Extending this to a distributed filesystem could be done by adding a remote file server object and layering this on top of the previously defined net object, as shown in Figure 7.14. To write a file, the local filesys first determines its physical location. If local, it operates identically to Figure 7.13. If remote, it builds a message and calls net.send() to deliver it to the remote server, which then operates as a local access. Reading a local file is again identical to Figure 7.13. Reading a remote file requires building a message, calling net.send() to deliver this to the remote server, calling net.recv() to get the server response, and unbuilding the message. Again, many minor details have been omitted to avoid needless complication, such as verifying that the received message is really the reply (or, employing remote procedure calls), and file creation.

112 com: send(to, msg) address = get_address_by_name(to) if is_local(address) enqueue(queue[address], msg) else net.send (address, msg) recv(from, *msg) address = get_address_by_name(from) if is_local(address) msg = dequeue(queue[address]) else net.recv (address, msg) net: send(address, msg) enqueue(queue[address], msg) recv(address, *msg) msg = dequeue(queue[address])

Figure 7.12. Model of Communication System
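The figure uses the shorthand notation; a rough executable transcription in Python follows. It is only a sketch: the address table, the local_node argument, and the receiver-side recv(me) signature are assumptions made to keep the example self-contained, and recv_any(), RPC, and error handling are omitted as in the figure.

from collections import deque

# Addresses are (node, process) pairs; this table is invented for the sketch.
ADDRESS = {"Index": ("node1", "index"), "UserIF": ("node2", "uif")}

class Net:
    """net layer: one FIFO queue per (node, process) destination."""
    def __init__(self):
        self.queues = {}
    def send(self, address, msg):
        self.queues.setdefault(address, deque()).append(msg)
    def recv(self, address):
        return self.queues.setdefault(address, deque()).popleft()

class Com:
    """com layer: name-to-address mapping; queue locally, or defer to net."""
    def __init__(self, local_node, net):
        self.local_node = local_node
        self.net = net
        self.queues = {}
    def send(self, to, msg):
        address = ADDRESS[to]
        if address[0] == self.local_node:            # is_local(address)
            self.queues.setdefault(address, deque()).append(msg)
        else:
            self.net.send(address, msg)
    def recv(self, me):
        # 'me' names the receiver; take locally queued messages first,
        # otherwise accept delivery from the underlying net object.
        address = ADDRESS[me]
        local = self.queues.get(address)
        if local:
            return local.popleft()
        return self.net.recv(address)

net = Net()
index_com = Com("node1", net)          # instance of std-ipc bound with Index
uif_com = Com("node2", net)            # instance of std-ipc bound with UserIF

index_com.send("UserIF", "hit list")   # remote destination: goes through net
print(uif_com.recv("UserIF"))          # -> hit list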

[Figure 7.13: Filesys flow model, write(file, data), with the file object as persistent state]
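A minimal executable sketch of such a flow-only filesystem model follows (Python rather than the shorthand of the figure). It tracks only which sources have flowed into each file, per the description above; the set representation and everything beyond the create(), zero(), write(), and read() names are assumptions.

class FlowFilesys:
    """Filesystem model that records information flows, not file contents."""
    def __init__(self):
        self.files = {}                  # file name -> set of flow sources

    def create(self, name):
        self.files[name] = set()         # persistent state, begins empty

    def zero(self, name):
        self.files[name] = set()

    def write(self, name, sources):
        # a write is a flow from 'sources' into the file's accumulated state
        self.files[name] |= set(sources)

    def read(self, name):
        # a read is a flow from the accumulated state out to the caller
        return set(self.files[name])

fs = FlowFilesys()
fs.create("audit.dat")
fs.write("audit.dat", {"UserA.query"})
fs.write("audit.dat", {"UserB.query"})
print(fs.read("audit.dat"))              # both sources have flowed into the file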

A -> X
B -> X
X -> C

Assume also that PChecks need only be performed on flows to the users, but not to the state variable X (this is discussed later). If A.SRV=S, B.SRV=TS, and C.SRV=S, intuitively we see that a flow violation is possible from B through X to C (from TS ⇒ S). However, looking at individual traces, the occurrence of a violation will depend on the particular event orderings. For example, if the events occur in the following order, no flow violation occurs:

A.S -> X          X now contains S info
X.S -> C.S        VALID since flow is S to S
B.TS -> X         X now contains S and TS info

However, if they occur in the following order, a flow violation occurs:

B.TS -> X         X now contains TS info
X.TS -> C.S       VIOLATION since flow is TS to S
A.S -> X          X now contains S and TS info
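The order dependence is easy to check mechanically for a single trace. The following Python sketch replays an event ordering for the flows above, accumulates the SRVs reaching X, and applies the PCheck only on flows out to users; the dictionary encoding of S and TS is an assumption of the sketch, not our analyzer.

# Replay one event ordering; accumulate SRVs in X; PCheck flows to users only.
LEVEL = {"S": 0, "TS": 1}                # simple dominance: TS dominates S
SRV = {"A": "S", "B": "TS", "C": "S"}    # labels from the example above

def replay(trace):
    x_contents = set()                   # SRVs that have flowed into X so far
    violations = []
    for src, dst in trace:
        if dst == "X":                   # flow into the state variable
            x_contents.add(SRV[src])
        else:                            # flow X -> user: PCheck applies here
            for lvl in x_contents:
                if LEVEL[lvl] > LEVEL[SRV[dst]]:
                    violations.append((lvl, "->", dst))
    return violations

order1 = [("A", "X"), ("X", "C"), ("B", "X")]
order2 = [("B", "X"), ("X", "C"), ("A", "X")]
print(replay(order1))    # []                     no violation
print(replay(order2))    # [('TS', '->', 'C')]    TS reaches an S user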

Thus, we must understand how to deal with this issue, and each possible concurrent operation exacerbates the problem. If such operations can run independently, or even partially so (e.g., two independent users, or a database loader running while the system is being used), the result can be an explosion of possible event interleavings and resulting possible traces. This is intuitive if one considers that each such unsynchronized concurrent event can occur at any state in any of the other traces. For example, user A can issue a find at any point during user B's entire interaction with the system. Not only are all such starting points possible, but the resulting trace can then be interleaved with all other traces.

Even if the application did not exhibit much concurrency or could be modeled without it, the environment (e.g., the operating system, filesystem, communication system) is guaranteed to add asynchronous concurrency, since it is also modeled as a set of CFSMs which interact with the application. In the real world the communication between these CFSMs and the application can be very complex (e.g., via all the channels between the OS and each application module). While we have greatly reduced the number of flow paths by employing the parasite flows, the resulting complexity is still large. For example, a parasite flow from application module A to the OS bound with it can occur at any state in A. If this is a distributed OS, it consists of a set of FSMs with its own set of possible traces to/from other OS modules bound with other nodes. Thus, the parasite flow from A to the OS does not stop at the local node, but flows through all possible traces in the distributed OS. However, the flow need not stop at the other nodes, since they may have possible parasite flows to all modules bound with them, such as other application modules. If these are involved in asynchronous concurrent operations they can be at any of their possible states.

For a simple quantitative example, consider the following. If just one parasite flow path existed from node 1 to node 2, and just one application module was running on each node, the number of event interleavings due to this parasite flow path is an M x N product (where M and N are the number of states in the two application modules). However, any two such modules will likely be a part of a much larger picture, greatly increasing the number of states (these modules will be involved in just a small number of the total possible number of states). Further, the larger picture will probably involve other modules on other nodes, with their own possible parasite flow paths between them, possibly including the original two nodes.

The explosion in the possible number of event interleavings is exemplified in [107], where Hoare discusses the number of possible traces in the simple Dining Philosophers problem. By first determining an upper bound on the number of possible states (approximately 1.8 million), the number of possible traces in the system is seen to exceed two raised to the power of 1.8 million! Not all of those traces will be interesting, or even attainable, but determining this is difficult. In any case, a distributed information retrieval system, even modeled in the high-level manner we have described, will have much more complexity, especially considering the effects of the external environment objects such as distributed operating systems and filesystems. Although many of these flows may be uninteresting, we do not know which ones, since whether or not a flow is interesting depends on what flows through it and how this relates to the policy. This is what we are trying to determine.
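As a rough, separate illustration of how quickly interleavings multiply (this is just a count of orderings, not the M x N state-product argument above), two independent traces of lengths m and n can be interleaved in C(m+n, m) ways, which a few lines of Python make vivid:

from math import comb

# Number of distinct interleavings of two independent traces of lengths m, n.
for m, n in [(3, 3), (5, 5), (10, 10), (20, 20)]:
    print(m, n, comb(m + n, m))
# 3 3 20
# 5 5 252
# 10 10 184756
# 20 20 137846528820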

8.3 Approaches to the Flow Analysis Problem

In order to better understand the complexity and ways of dealing with it, we investigated different approaches to problems that appeared to involve similar flow issues. In the following sections we examine these, their similarities to and differences from our needs, and the problems encountered with their use. We then discuss our simulation-based approach.

8.3.1 Security Information Flow Analysis

Information flow analysis has been applied to secure systems for a number of years. However, the analysis is generally aimed solely at the verification of an existing design, not at the design process itself. Furthermore, it often relies on a transitive analysis where individual state transitions can be analyzed in isolation without concern about possible event interleavings around those transitions. For example, if the two flows A ⇒ B and B ⇒ C are independently known to not violate the policy, then their composition will also not do so (note, while such composition may well violate higher level policies, this is not a relevant issue for those analyses). This is fundamental to flow verification of large systems, since any attempt to verify all possible flow paths would be difficult, if not impossible [87]. Although our simple flow policy (Section 7.1.3.2) happens to be transitive, other interesting policies are not always so, as pointed out in [82]. Nontransitive examples include Chinese wall [36] and confinement policies [82]. A Chinese Wall policy, for example, is a form of aggregation policy where access is controlled based on aggregates of data sets.

More importantly for our purposes, even if we could guarantee a transitive policy, the transitivity principle in the analysis is only useful if we can independently validate each flow. This requires that the source and destination SRVs of each flow be known explicitly, or that flow constraints based on these SRVs exist. This is fundamental to most flow analysis efforts, and is also the cause of much manual effort involved in their success. In many cases, the source and destination SRV are assumed to be static. For example, we may know that a flow A ⇒ B is between two disk files, and that each is labeled with a SC. If we then constrain the flow such that A ⇒ B is permitted only if (B DOM A), then it is independently seen to uphold the policy. If each flow is analyzed this way, transitivity insures that the policy will be upheld under the composition of all such flows. Another example can be seen in state-based policy models such as BLP, where the system state is comprised of a set of objects plus an access control matrix defining allowable access among those objects. One must verify that each atomic system operation, when initiated from a secure state, will then terminate in a secure state. All that remains is to define a secure initial state, and transitivity insures that any combination of operations is secure.

However, we are not able to fully define the system in these ways. In fact, we wish to start with essentially zero knowledge about the SRV at almost every point in the system. The original purpose of our flow analysis was to tell us where the SECREL information flows were and where these may be a problem, not for us to guess at SRV values and unnecessarily assign them up front. Our desire for tools to tell us what is happening is quite different from assigning SRVs beforehand and then asking the tool if that particular marking upholds the policy. This makes the verification approaches generally inapplicable to our needs.

There is another fundamental difference between the verification efforts and our needs that actually helps to simplify the problem for our purposes. By their nature, verification efforts must be comprehensive and mathematically provable. Their purpose is to prove that a given system design will operate as expected. Our goals are not to prove things, but instead to offer guidance in the early phases of the design process. We want to see the effect of the possible flows in the system before any security aspects have even been designed, as well as during the process in which they are incrementally designed into the system. Certainly, as the security architecture evolves, there is nothing to preclude the application of such formal approaches to further refine the analysis. However, this is not required at this stage of this research, and is independent of our goals. Also, as mentioned previously, the resources required for formal verification are far beyond those available to most application developers. Thus, the increased complexity we inherit by leaving many things unspecified at the start of the design process is somewhat offset by our relaxed requirements on what the tools have to provide. This tradeoff will help to make our efforts tractable.
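For contrast, when labels are known up front the per-flow check described above is trivial to mechanize. The following Python sketch validates each flow A ⇒ B independently against (B DOM A), using an invented linear ordering of security classes and ignoring categories; it is an illustration of the verification-style constraint, not a piece of our methodology.

# Independent per-flow validation: allow A => B only if B dominates A.
LEVEL = {"U": 0, "C": 1, "S": 2, "TS": 3}      # invented linear ordering

def dominates(b, a):
    return LEVEL[b] >= LEVEL[a]

labels = {"payroll.db": "S", "summary.rpt": "TS", "public.txt": "U"}
flows = [("payroll.db", "summary.rpt"),        # S  => TS : upholds policy
         ("summary.rpt", "public.txt")]        # TS => U  : violates policy

for a, b in flows:
    ok = dominates(labels[b], labels[a])
    print(a, "=>", b, "allowed" if ok else "violation")

Transitivity then lets such per-flow results compose; the difficulty for our purposes, as noted above, is that the labels are exactly what we do not want to assign in advance.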

8.3.2 Data Flow Analysis in General

Moving away from the security arena, the general issue of data flow analysis in complex systems is certainly not new. There are many well-understood techniques for dealing with this, usually to aid in resource allocation (e.g., compiler optimizations). Our information flow issues are clearly similar. These techniques are generally geared towards calculating cumulative information, generally of two types [145]: 1) given a point in a program, what is the set of definitions that may be valid there (forward flow problems), and 2) given a point in a program, what is the set of later definitions that can be affected by the definitions that exist at that point (backward flow problems). By replacing the notion of "definitions" with the notion of "information dependencies" (as employed in our LIFA development), such calculations could conceivably tell us about aggregate information flows, such as determining the total set of SRVs that can flow into a particular state variable. However, much of our interest in the initial design steps is to see the results of individual flows, not aggregates. For example, assume a local variable within the index receives a flow from each user's query expression and then sources a flow outward to a per-user state variable (e.g., to keep an audit trail), as shown here.

recv(user, query, ...)
    tmp
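To illustrate the aggregate ("forward flow") style of calculation contrasted above with individual flows, the following Python sketch propagates, ignoring event order, the set of sources that can ever reach each variable; the flow graph and names are invented for the example.

# Cumulative forward-flow calculation: which sources can ever reach a variable.
flows = [("UserA.query", "tmp"), ("UserB.query", "tmp"),
         ("tmp", "auditA"), ("tmp", "auditB")]

reaches = {}                              # variable -> set of original sources
changed = True
while changed:                            # simple fixed-point iteration
    changed = False
    for src, dst in flows:
        incoming = reaches.get(src, {src})
        current = reaches.setdefault(dst, set())
        if not incoming <= current:
            current |= incoming
            changed = True

print(sorted(reaches["auditA"]))          # ['UserA.query', 'UserB.query']

In this aggregate view both users' queries appear to reach each per-user variable; distinguishing what a particular trace actually carries through the temporary requires looking at individual flows, which is the interest expressed above.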
